Advice Regarding SQL Data Storage

Discussion in 'Plugin Development' started by Scizzr, Sep 15, 2013.

Thread Status:
Not open for further replies.
  1. Offline

    Scizzr

    So I'm working on a (yes, yet another) stat plugin. I've got everything working and everything is fine. But I feel that the logic of saving the stats could be better. What I've got right now is a bunch of enums (think like COALORE_BROKEN, IRONORE_BROKEN, etc) and an object that keeps track of these breaks. In my SQL table, I've got every single player with every single stat on a different row. So I've got a row for Joe's COALORE_BROKEN, and another one for his IRONORE_BROKEN, and yet more for other things. I feel that it would be more both more friendly to future updates and more friendly to admins (if something needed to be adjusted) if I simply made a string that contains all of these things, and that is completely possible. The problem is that the string would be so long once players started breaking different types of blocks. Here's an example of what an entry could look like.

    Column: Value
    -----
    username: Joe
    blocks_broken: 1:0;1|2:0;2|17:3;1

    So basically, it would be a simple pipe-delimited string, with the format ID : DATA ; AMOUNT for each block. That's cool and all, but I'm just not sure that's the best way to do this. I mean, that would have only one row for each user, but I don't know if having a 10,000 character VARCHAR is reasonable.

    Thoughts?
     
  2. Offline

    tommycake50

    My thoughts are that your current way is better.
     
    Scizzr likes this.
  3. Offline

    Scizzr

    Any particular reason? Sorry, just trying to get advice.
     
  4. Offline

    tommycake50

    Well its efficiency over friendliness.
    I would always go for efficiency.
     
    Scizzr likes this.
  5. Offline

    uyuyuy99

    tommycake50 I'm not sure his current way is better. First of all, why do you think it's less efficient? If the values are only loaded on startup or when a player logs in, efficiency would be perfectly fine using a large string. Also, there's no reason to make a separate enum value for every single different combination of block/action - there must be hundreds upon hundreds of enum values. Using a giant string is, admittedly, a weird way of doing it, but at least it's flexible.

    Scizzr Also, if you choose to go with the giant string method, you should be using the TEXT data type in mysql, not VARCHAR.
     
    Scizzr likes this.
  6. Offline

    Scizzr

    I've always heard that TEXT is not desirable, and that CHAR is the way to go.

    A quick search also shows that TEXT is "deprecated".

    http://technet.microsoft.com/en-us/library/ms187993.aspx
    Edit: Also, you are correct. The stats are reloaded when the player logs in. Currently, I have a Stat object that is saved on a "has been changed" basis, using a simple boolean to keep track of that. So there's another thing to consider, how much data will be being passed across the network at any one given point. Let's say, for example, that the player has broken every single block in the game 10,000 times. That's REALLY over-exaggerating, but it's possible that some lonely person could achieve that (or some combination of breaks that could total that).

    Let me assume some things, giving a lot of room
    • There are 140 different placeable blocks (multiple data values count as 1 block)
    • We can assume the max data value to be 2 numbers long (0 to 99)
    • We will assume that there are 200 blocks, and give some room:
    ○ All 200 blocks have 3-digit IDs (3)
    ○ All 200 block can have a max data value of 99 (2)
    ○ The max number of a specific block will not exceed 99,999 (5)
    ○ We need to divide the data value with a colon for data and a semicolon for number (2)
    • All of this together would be: 200*(3+2+5+2) = 200*12 = 2400

    So really, I could either have 140 enums for the different blocks, or a VARCHAR of length 2,400 (we'll call it 2,500 to round up). And I could even be so intelligent as to have the plugin check this string length when it's being written to the database and have it automatically message admins/ops/whoever. That's not so bad.

    The real difference in this all will be that I can't do something like
    Code:
    int stone = UserManager.getUser("Joe").getStat(StatType.STONE_BROKEN, 0).getValue();
    //0 is the data value
    
    and instead will do something like
    Code:
    int stone = UserManager.getUser("Joe").getBlocksBroken(1, 0);
    //1 is the block id
    //0 is the data value
    
    But again, this allows me to append to the end if the value is not already in the list, or whatever. It's just going to lead to huge queries.
     
  7. Offline

    uyuyuy99

    And a longer search would show that it's not. Unless you're using "Microsoft SQL Server" (I assume you're actually using MySQL), then no, TEXT is certainly not deprecated, and in fact is used quite frequently. Because you will not need to perform searches via SQL on this giant string, you won't use it as an index, and it will be extremely long, it would be wise to store it as a TEXT field.
     
    Scizzr likes this.
  8. Offline

    beastman3226

    MEMO holds over 65,000 characters. Should be plenty. However, you could just use a table per player. Get the last date of update to the table and if it is within a certain range, delete the table.
     
    Scizzr likes this.
  9. Offline

    macguy8

    I believe your current way is better for the following reasons.

    It'd be easier to see if you are looking at the database from the console
    It's more efficient to get / set one value than to get a value, parse a string, change one part of it, and then reformat it.
     
    Scizzr likes this.
  10. Offline

    lycano

    Scizzr when designing a database never go for "more friendly to admins" always think of a good concept first then try to think from what perspective things happen more often. Like "do you plan to lookup a block at a position then check the stats for the type of the block OR do you only want to allow user stats"

    In any case an efficient way to accomplish your goal would be using different tables for different types. As the data you store can be grow fast you need to make use of indexing and query cache.

    Things that will be queried often will be held back for quite a time so you should make use of that and by putting each type of action into a different table.

    I usually split data into the following tables:

    PlayerNames, PlayerAttributeName, PlayerAttribute_EventType,

    For each EventType i would create a new attribute table that contains all attributes for that Event.

    Example:

    Table: PlayerNames
    Fields: PlayerId, PlayerName
    Data: 1, Player

    PlayerId is Unique

    Table: PlayerAttributeName
    Fields: PlayerAttributeId, PlayerAttributeName
    Data: 1, BlockBreak
    Data: 2, EntityDamage

    PlayerAttributeId is Unique

    Table: PlayerAttribute_BlockBreak
    Fields: AttrId, PlayerId, Key, Value
    Data: 1, 1, DIRT, 1

    Note: Key is the Material Name
    AttrId and PlayerId is Unique

    Table: PlayerAttribute_EntityDamage
    Fields: AttrId, PlayerId, Key, Value
    Data: 1, 1, Entity, 34

    Note: Key is Type of Entity so if he did take damage from another Player you could also add
    1, 1, Player, 12

    If you want to add the cause of damage also i would create another table where you add a new damage cause everytime and only link ids.

    Note: Player is not the name of the Player its the ClassType.

    You could create a view to link all data together but technically you can always left join your data together. If you think thats too much for each user then go and create a PlayerData Table that links all data together.

    PlayerTransactions (MainTable)
    TransactionID, PlayerId, AttributeName, AttrId
    1, 1, 1, 1
     
    Scizzr likes this.
  11. Offline

    1Rogue

    It hasn't been stated yet, so I'll just throw this in:

    Don't confine yourself to a single row per user. You can save your rows like so:

    INT INDEX KEY id | VARCHAR(32) username | VARCHAR(32) yourformattedstring

    You can add a new data entry for every single block broken. You don't need to confine it, databases can hold millions of rows. In this, you have much more flexibility to search through their previous blocks:

    SELECT `yourformattedstring` FROM `yourtable` WHERE `username`='example'
    SELECT `yourformattedstring` FROM `yourtable` WHERE `username`='example' LIMIT pagenumber, amount

    If you want, you can even store a separate table of username with id values, and then use id's in your main block recording table (since grabbing a numeric value is faster than a string/varchar value).
     
    Scizzr likes this.
  12. Offline

    CubieX

    Using a concaternated string in a database is not a good idea for the most part.
    You cannot work with such a string in any way, without splitting it up and interpreting it beforehand.

    Question is: Do you just need a counter for every block type per player?
    Or do you want to log each block break individually like LogBlock does it?
    I guess the first one?

    If it's the first:
    I would suggest you make a Player table which holds player names and a "BrokenBlocks" table that holds each blocktype in a separate column and each row has a player ID and those block types that can be broken.
    So you can easily query each block type for each player, without having to postporccess your concaternated string first every time you need some information.
    However, this is only practical for a "total number of broken blocks of type xyz for player xyz" query.

    If you want to log every block break individually (and not only the number of broken blocks per type), you will have to add an entry per break.
    In this case, you would define a Player table and a Blocks table and a BrokenBlocks table,
    and each entry would only hold the unique index of the entry, the playerID and the broken blocks ID and whatever you may need additionaly (timestamp or location..)

    You current approach has the weakness, that you have to save the player name or id for every entry separately, instead
    of only having exactly 1 entry per player.
    This is not that good. It has something to do with database normalization.
     
    uyuyuy99, Scizzr and metalhedd like this.
  13. Offline

    metalhedd

    This is a very important point. Don't bother using a database if you're not going to use it as it was intended, a database is not a dumb datastore, its meant for relational data. you can't relate anything to a chunk of text that has no meaning.
     
    uyuyuy99 and Scizzr like this.
  14. Offline

    Scizzr

    I'm just storing the number of each type of block broken, not an in-depth log like LogBlock or HawkEye or BigBrother.

    I'm not storing just blocks broken. I'm storing that, distance traveled (jumped, walked, swam, etc), number of each type of mob killed, damage take/dealt, and LOTS of other things. My current enum has about 300 values.

    The concept is that it's for "achievements" for my server. Again, I've got it working fine and all, it's just a pain in the ass to check when someone breaks a block that the correct stat gets referenced. I could just do something like Block.getType().name() and then in my enum, make a getByName() method to return the right enum value.

    Edit: Just wanted to take a minute and thank everyone that replied. I've since decided to go the relative route, and let the database be unfriendly to admins. What I'll end up doing is implementing a tool they can use (command, web interface, etc). Also, I made a pretty nice query.
    Code:
    SELECT type_name, value FROM users_stats JOIN users_users ON users_users.user_id = users_stats.user_id WHERE users_users.username = 'UsernameGoesHere';
    
    It basically pulls all of their stats from the users_stats database after looking up their user id from the users_users table.
     
Thread Status:
Not open for further replies.

Share This Page