Database model ZXDB


Comments

  • hikoki wrote: »
    @jonesypeter
    It would be nice to be able to enlarge images by pinching with your fingers
    ...
    Another idea...

    Thanks for the suggestions!

    The current layout is very basic on purpose. We want to keep it simple and easy to maintain.
    As Peter mentioned, we are focusing on features and on showing as much information as possible from ZXDB.
    So all effort goes to that right now. Of course it will improve in the future.

    About retro-styled sites: they only look good to those "in the know". Most people don't like them; they just think it's some old site from 1997 full of animated GIFs and Flash or something.
    SpectrumComputing is already quite retro, with those big buttons and colours. It works because we have a Spectrum photo on the loading page, which creates visual unity.

    But again, keep those suggestions and critiques coming, they are welcome.






  • Do you mind if I import your already finished reviews list (Crash, etc) from ZXSR to ZXDB? Also we should consider storing links to ZXSR in ZXDB if that's useful...

    I think it would be very useful to store the links so I've created an Excel sheet that can be downloaded from my Dropbox account here.

    This is a list of all completed reviews with the ZXSR URLs, as well as the Infoseek ID for cross-referencing and some other data for info purposes. The URL to the reviews will always begin with:

    http://www.zxspectrumreviews.co.uk/review.aspx?

    So it's probably not worth storing that in ZXDB, to save some space; entirely up to you though.

    You'll notice that there are some NULL Infoseek IDs. These are where I couldn't find an Infoseek entry, so they're probably best ignored as they're never going to match an Infoseek/ZXDB record. Unfortunately Excel has truncated all of the leading zeros on the Infoseek IDs; let me know if that's a problem and I'll fix it.

    With "Do you mind if I import your already finished reviews list (Crash, etc) from ZXSR to ZXDB", did you mean importing the full review text into ZXDB? It would be a great idea eventually, but at the moment the data is stored across many tables (the idea being that review scores could be searched on, for example), so the entire review isn't stored in one big text field. It wouldn't take all that much effort to do, but it would be large: the entire ZXSR database is over 300 MB and is only going to grow!

    Actually, you have all the (completed review) URLs now so in theory you could write a screen-scrape routine and rip the HTML apart. Either that or I could get my backside in gear and create a REST service :-)
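    A screen-scrape routine along those lines could be sketched in Python. The real ZXSR page markup is unknown here, so the `<div class="review">` container and the sample HTML below are pure assumptions for illustration:

```python
from html.parser import HTMLParser

class ReviewTextExtractor(HTMLParser):
    """Collect text from a hypothetical <div class="review"> container.

    The real ZXSR markup is an assumption here, not the actual page layout.
    """
    def __init__(self):
        super().__init__()
        self.in_review = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "review") in attrs:
            self.in_review = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_review = False

    def handle_data(self, data):
        if self.in_review:
            self.chunks.append(data.strip())

def extract_review(html):
    parser = ReviewTextExtractor()
    parser.feed(html)
    return " ".join(c for c in parser.chunks if c)

# The HTML would normally be fetched from one of the stored review.aspx URLs
# before being fed to the parser; a hard-coded sample is used here instead.
sample = '<html><body><div class="review">A classic platformer.</div></body></html>'
print(extract_review(sample))  # A classic platformer.
```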

    Any probs, or if you need anything else, just give me a shout.
  • Vampyre wrote: »
    Do you mind if I import your already finished reviews list (Crash, etc) from ZXSR to ZXDB? Also we should consider storing links to ZXSR in ZXDB if that's useful...

    I think it would be very useful to store the links so I've created an Excel sheet that can be downloaded from my Dropbox account here.

    Thank you! I will add all these links to ZXDB!

    Vampyre wrote: »
    This is a list of all completed reviews with the ZXSR URLs, as well as the Infoseek ID for cross-referencing and some other data for info purposes. The URL to the reviews will always begin with:

    http://www.zxspectrumreviews.co.uk/review.aspx?

    So it's probably not worth storing that in ZXDB, to save some space; entirely up to you though.

    I will probably store the complete URLs anyway. Removing the prefix wouldn't save much space and I prefer to keep ZXDB content as straightforward as possible.

    Vampyre wrote: »
    You'll notice that there are some NULL Infoseek IDs. These are where I couldn't find an Infoseek entry, so they're probably best ignored as they're never going to match an Infoseek/ZXDB record.

    ZXDB is supposed to contain all Spectrum games. I have already added many titles missing from WoS and, if there are still any titles left without an Infoseek ID, I will add them too.

    Let me check your spreadsheet, I will get back to you on this.

    Vampyre wrote: »
    Unfortunately Excel has truncated all of the leading zeros on the Infoseek IDs; let me know if that's a problem and I'll fix it.

    Not a problem at all.

    Vampyre wrote: »
    With "Do you mind if I import your already finished reviews list (Crash, etc) from ZXSR to ZXDB", did you mean importing the full review text into ZXDB? It would be a great idea eventually, but at the moment the data is stored across many tables (the idea being that review scores could be searched on, for example), so the entire review isn't stored in one big text field. It wouldn't take all that much effort to do, but it would be large: the entire ZXSR database is over 300 MB and is only going to grow!

    No, I just want to import your review links and some additional information like magazine issue, page, etc. It wouldn't make sense to put your entire site content into ZXDB...

    Vampyre wrote: »
    Actually, you have all the (completed review) URLs now so in theory you could write a screen-scrape routine and rip the HTML apart. Either that or I could get my backside in gear and create a REST service :-)

    It probably won't be necessary.

    Vampyre wrote: »
    Any probs, or if you need anything else, just give me a shout.

    Thanks a lot!
    Creator of ZXDB, BIFROST/NIRVANA, ZX7/RCS, etc. I don't frequent this forum anymore, please look for me elsewhere.
  • Thanks to Einar adding the Crash issue numbers to ZXDB, the magazine links for Crash, along with those from Sinclair User, Your Spectrum and Your Sinclair, now work.
  • Pavero wrote: »

    It's also missing IllogicAll.

    @hikoki can you please add both to your list?
  • I'll add mods to my site as soon as possible. Cheers!
  • oofff there are quite a few mods.. what qualifies as a mod?
    ULAplus mods? like these ones: http://abrimaal.pro-e.pl/zx/ulaplus.htm
    coloured mods? like these ones: http://www.zx-spectrum.cz/index.php?cat1=4&cat2=2&article_id=coloured.php&lang=en
    cracks? like these ones: https://speccy-live.untergrund.net/tag/cracks/
    patched games? like fixed versions (Impossible Mission) or poked versions enabling different keyboard controls (Penetrator)
  • hikoki wrote: »
    oofff there are quite a few mods.. what qualifies as a mod?
    ULAplus mods? like these ones: http://abrimaal.pro-e.pl/zx/ulaplus.htm
    coloured mods? like these ones: http://www.zx-spectrum.cz/index.php?cat1=4&cat2=2&article_id=coloured.php&lang=en
    cracks? like these ones: https://speccy-live.untergrund.net/tag/cracks/
    patched games? like fixed versions (Impossible Mission) or poked versions enabling different keyboard controls (Penetrator)

    AFAIK this was Martijn's criteria:

    * MOD means extensively modifying the original game to either provide new gameplay (like different levels for JSW) or significantly change the game experience (like Ralf's changes from B&W to colored games).

    * CRACK means superficial changes like removing copy protection, adding infinite lives or trainers, porting from tape to disk (or vice-versa), adding an intro or modifying minor details like messages, color palette, etc.

    * BUGFIX means fixing serious bugs in a game, while leaving everything else as close to the original as possible.
    So there are three types of modifications: Mod, Crack, Bugfix.
    I have one column called 'MOD OF', so I'm not sure how to deal with the other two types.
    For now, I'm going to add a few mods.
  • There's another ZXDB update here.

    This latest update includes ZXSR content, although it is stored in a separate table for now. I will gradually merge this new information into other ZXDB tables. I have also added new entries for products reviewed by magazines (Crash, YS, SU, etc.) that were missing from the WoS archive (mostly hardware devices for the ZX Spectrum).

    Moreover, there's now a separate file called "ZXDB_aux.sql" that creates auxiliary tables to help improve database searches. For instance, right now if you search for "sokoban" in the WoS archive you will find 15 titles, but searching for "sokoban" at Spectrum Computing will only find 8. These auxiliary tables will solve this kind of problem. They are not part of ZXDB directly, since I still believe it would be better to cache the entire database in memory and perform all searches locally... but since people prefer to use the database, I might as well help them! Whenever ZXDB is updated, it will be necessary to run this script again.
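    The contents of ZXDB_aux.sql aren't shown here, but the general idea of an auxiliary search table can be sketched in Python: index every known name (canonical title plus aliases) under a normalized form, so a search for "sokoban" also finds titles released under variant names. All titles, IDs and aliases below are made up for illustration:

```python
import re

def normalize(title):
    """Lower-case and strip punctuation so 'Sokoban!' and 'sokoban' compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

# Made-up mini catalogue: (entry id, canonical title, known aliases)
CATALOGUE = [
    (1, "Sokoban", []),
    (2, "Push-Push", ["Sokoban II"]),
    (3, "Boulder Dash", []),
]

# Auxiliary index: normalized searchable name -> entry ids
search_index = {}
for entry_id, title, aliases in CATALOGUE:
    for name in [title] + aliases:
        search_index.setdefault(normalize(name), set()).add(entry_id)

def search(term):
    """Find entries whose canonical title OR any alias contains the term."""
    term = normalize(term)
    hits = set()
    for name, ids in search_index.items():
        if term in name:
            hits |= ids
    return sorted(hits)

print(search("sokoban"))  # [1, 2] - the alias makes entry 2 findable too
```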
  • Vampyre wrote: »
    You'll notice that there are some NULL Infoseek IDs. These are where I couldn't find an Infoseek entry

    I have added all these missing entries to the ZXDB database. You can now use the new Infoseek IDs listed below:
    INFOSEEK       TITLE
    0029703        Nanodemo
    1000985        Atari Joysticks
    1001009        ZX Printer Spool
    1001010        Zipstick Super Pro
    1001011        Keyboard Stroke Kit
    1001012        Digital/Analogue Converter
    1001013        Fatherboard
    1001014        Computer Care Kit
    1001015        Tandy CGP-115
    1001016        Microline M80
    1001017        Brother EP44
    1001018        Star Micronix Gemini 10X
    1001019        Olivetti JP101
    1001020        Mannesmann Tally MT890
    1001021        Canon PW1080A
    1001022        Dyneer DW16
    1001023        Epson RX80
    1001024        Spectrum Console
    1001025        Flexible Edge Connector
    1001026        Easyload
    1001027        CEAF (Contrast Enhancement Antiglare Filter)
    1001028        Microdrive Cable
    1001029        RS232 Cable
    1001030        Kappa Keyboard Cartridge Box
    1001031        Screen Cycler
    1001032        Yewser Reset Switch
    1001033        Rainbow Electronics Reset Switch
    1001034        Computable
    1001035        101QL Monitor
    1001036        MVM12G Monitor
    1001037        Boxer 12 Monitor
    1001038        Disc Doubler
    1001039        Multitron Megamouse
    1001040        'T' Switcher
    1001041        SD-10 Printer
    1001043        Data-Binder
    1001044        Lawtronics RGB Monitor Interface
    2001486        Over The Rainbow
    

    Your spreadsheet also contained obsolete Infoseek IDs that are no longer valid in the WoS archive. This usually happens when WoS finds out that a certain game is just a re-release of another game under a different title; in this case both entries are "merged", so the old Infoseek ID becomes invalid. Please update your Infoseek IDs for the following entries:
    INFOSEEK       TITLE
    0003955        Riddle of the Sphinx (Longman Software)
    0004580        Smudge and the Moonies (Nationsoft)
    0005182        Ping Pong (Claus Jahn/Andy Schraepel)
    0027376        Introduction to Expert Bridge (Bridgemaster)
    
  • Einar wrote: »
    ...I still believe it would be better to cache the entire database into memory and perform all searches locally... but since people prefer to use the database, I might as well help them!
    This sounds interesting, though it may be difficult to implement or may increase data traffic (I recall Lee didn't seem very keen on this idea). Would it help to use client-side caching, or to load on demand when the user scrolls down to read more search results?
  • hikoki wrote: »
    This sounds interesting, though it may be difficult to implement

    No, it's not. Accessing data already available in memory would make the implementation much simpler.

    hikoki wrote: »
    or may increase data traffic

    No, it doesn't. The data traffic between server and client will be exactly the same, regardless of how the server obtained this information.

    Also, the data traffic between the server and the database would decrease a lot. The server would just need to read the database once every hour or so (to update the cache) instead of accessing it on every web page access. Moreover, this cache update would be asynchronous, so all data would remain instantly available in memory even during cache refreshes.
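    As a rough sketch of that refresh scheme (the `load_from_db` stub below stands in for the real database read; nothing here is ZXDB's actual code):

```python
import threading
import time

class CachedArchive:
    """In-memory snapshot of the database, refreshed in the background.

    `load_from_db` is a stand-in for the real database read. Readers always
    see a complete snapshot: each refresh builds a fresh dict and swaps it
    in with a single reference assignment, so no reader ever blocks.
    """
    def __init__(self, load_from_db, interval_seconds=3600):
        self._load = load_from_db
        self._interval = interval_seconds
        self._snapshot = self._load()  # initial synchronous load

    def start_refreshing(self):
        threading.Thread(target=self._refresh_loop, daemon=True).start()

    def _refresh_loop(self):
        while True:
            time.sleep(self._interval)
            self._snapshot = self._load()  # atomic swap of the whole snapshot

    def get(self, key, default=None):
        return self._snapshot.get(key, default)

# Stand-in "database": a plain dict, as if read from ZXDB in one pass
archive = CachedArchive(lambda: {"example-id": "Some Game Title"})
archive.start_refreshing()
print(archive.get("example-id"))  # served from memory, no database hit
```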

    hikoki wrote: »
    (I recall Lee didn't seem very keen on this idea).

    Yes, but I recall Lee didn't give any reason against this idea.

    The only difference is that, whenever moderators approve a submitted archive change, it could take up to an hour for all users to see it (unless they forced a cache update). So what? This is a small price to pay for a faster, more scalable web service.

    hikoki wrote: »
    Would it help to use client-side caching, or to load on demand when the user scrolls down to read more search results?

    These choices about the way the client works have nothing to do with the way the server accesses its data! :)
  • The database is on a separate server which has hundreds of other clients on it, accessed from several servers. Bearing in mind I've been doing this for 20 years, there is a massive list of clients in use.

    I'm not restructuring, and therefore potentially having to adjust every client's site or service, just for what is essentially a tiny website. Unless you wish to pay for a shed load of staff?

    New WoS has a local cache which serves its purpose, and will probably move to an SSD cache this year. Either way, hits to the database are minimal anyway.
    My test signature
  • Countdown to that appearing on twitter..... 5...4...3...
  • The database is on a separate server which has hundreds of other clients on it, accessed from several servers. Bearing in mind I've been doing this for 20 years, there is a massive list of clients in use.

    It shouldn't be a problem to implement a new system that only accesses the database to refresh its cache (instead of on every page), regardless of where the database is located... unless the new WoS will also be built as part of the same system you already have for other clients? If that's the case, then OK, I can see why it would be hard to change.
  • Countdown to that appearing on twitter..... 5...4...3...

    Since you are worried about Twitter... I wrote my last reply as a single paragraph on purpose, so nobody will be able to quote it without including the last part where I agreed with you :)
  • Countdown to that appearing on twitter..... 5...4...3...

    Since you are worried about Twitter... I wrote my last reply as a single paragraph on purpose, so nobody will be able to quote it without including the last part where I agreed with you :)

    Good effort but they screenshot instead lol.
  • There's a minor ZXDB update here.
  • hikoki wrote: »
    Einar wrote: »
    ...I still believe it would be better to cache the entire database into memory and perform all searches locally... but since people prefer to use the database, I might as well help them!
    This sounds interesting, though it may be difficult to implement or may increase data traffic (I recall Lee didn't seem very keen on this idea). Would it help to use client-side caching, or to load on demand when the user scrolls down to read more search results?

    If you can write a better in-memory solution, then you're using a terrible RDBMS in the first place. I'm not sure why modern developers always assume that they'll write something better than systems which have had decades of tuning behind them.

    Also the WoS database is tiny, it is probably sat in memory cache on the DB server anyway and most accesses are going to be servable via a generic web cache.
  • AndyC wrote: »
    If you can write a better in-memory solution, then you're using a terrible RDBMS in the first place. I'm not sure why modern developers always assume that they'll write something better than systems which have had decades of tuning behind them.

    I think you may have misunderstood the concept. You don't need to move the RDBMS into memory. You can use it to feed an in-memory mirror using native data structures from your language of choice.

    There are some nice set operations in Python, for example, that can be useful when you can afford to keep all of your data in memory. You can even (ab)use some low-footprint appliances without decent I/O but with lots of memory to do the job.

    I agree that it's not a good approach for the day-to-day database, but there are some niches where this can work better.
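    For illustration, an "advanced search" over pre-built in-memory indexes reduces to a set intersection; the genres, years, publishers and ids below are made up:

```python
# Made-up pre-built indexes mapping attribute values to sets of entry ids
by_genre = {"arcade": {1, 2, 5}, "strategy": {3, 4}}
by_year = {1984: {1, 3}, 1985: {2, 4, 5}}
by_publisher = {"Ocean": {2, 5}, "Ultimate": {1}}

# "Arcade games from 1985 published by Ocean" is just a set intersection:
result = by_genre["arcade"] & by_year[1985] & by_publisher["Ocean"]
print(sorted(result))  # [2, 5]
```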
    @AndyC
    I just thought Einar made an interesting point that might be being overlooked
  • Guys, there's no need to keep sending me PMs or FB messages! I know Paul Andrews & Co have edited the Wikipedia template to point all software entries to Spectrum Computing instead of WoS, because Peter told me first!
  • JSilva wrote: »
    AndyC wrote: »
    If you can write a better in-memory solution, then you're using a terrible RDBMS in the first place. I'm not sure why modern developers always assume that they'll write something better than systems which have had decades of tuning behind them.

    I think you may have misunderstood the concept. You don't need to move the RDBMS into memory. You can use it to feed an in-memory mirror using native data structures from your language of choice.

    There are some nice set operations in Python, for example, that can be useful when you can afford to keep all of your data in memory. You can even (ab)use some low-footprint appliances without decent I/O but with lots of memory to do the job.

    I agree that it's not a good approach for the day-to-day database, but there are some niches where this can work better.

    Exactly my point.

    ZXDB fits this niche because:

    * It's small enough to easily fit everything into memory.

    * It's almost read-only. Updates would probably happen just a few times per day, upon administrators' approvals. Moreover, users wouldn't even know whether they are seeing changes 1 second or 1 hour after an approval anyway.

    * Almost every archive search is a LIKE based on a partial program title or individual/company name. Therefore it cannot use a database index and requires a FULL TABLE SCAN. This is a very inefficient way to use a database, not much different from simply comparing entries one-by-one in memory.

    * Generating the complete webpage about a game requires combining data from about 50 tables. More than half of them are 1-N or N-N relations, so that's a fairly complex set of information to process. It requires over 20 database accesses for a single webpage, or a single SELECT combining about 50 tables using OUTER JOINs. Even if the database caches all these results internally, it won't avoid repeating all the hard work required to regroup and reorganize this information into a structured webpage. It would be much easier to simply read the entire content of all tables one-by-one, build the entire data structure in memory, then simply traverse this already-organized data when building each webpage.

    * Web server caching would work better than database caching anyway, since database caching would not avoid repeating all the work involved in extracting the information from SQL results and building a webpage again.

    Therefore this system is a perfect candidate for caching the entire database into a data structure in memory and refreshing it periodically.

    (The only exception is the logic to manage change requests and approvals, which requires transaction control, but that's not part of ZXDB itself.)
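    As an illustrative sketch of that approach (the three "tables" below are made-up stand-ins for ZXDB's real schema), reading each table once and pre-joining the 1-N relations yields a nested structure that each page render just traverses:

```python
from collections import defaultdict

# Made-up stand-ins for three ZXDB-like tables, read once each
games = [(1, "Knight Lore"), (2, "Jetpac")]
releases = [(1, 1984, "Ultimate"), (2, 1983, "Ultimate")]  # game_id, year, publisher
reviews = [(1, "Crash", 94), (1, "Your Spectrum", 90), (2, "Crash", 89)]

def build_mirror():
    """Pre-join the 1-N relations into one nested dict per game."""
    reviews_of = defaultdict(list)
    for game_id, magazine, score in reviews:
        reviews_of[game_id].append({"magazine": magazine, "score": score})
    release_of = {game_id: {"year": year, "publisher": publisher}
                  for game_id, year, publisher in releases}
    return {game_id: {"title": title,
                      "release": release_of.get(game_id),
                      "reviews": reviews_of.get(game_id, [])}
            for game_id, title in games}

mirror = build_mirror()
# Rendering a game page is now a plain traversal - no joins, no SQL:
print(mirror[1]["title"], [r["score"] for r in mirror[1]["reviews"]])
```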

  • * It's small enough to easily fit everything into memory.

    It is small *at the moment*

    * It's almost read-only. Updates would probably happen just a few times per day, upon administrators' approvals. Moreover, users wouldn't even know whether they are seeing changes 1 second or 1 hour after an approval anyway.

    A few times a day? Martijn was working almost full time on updates...

    * Almost every archive search is a LIKE based on a partial program title or individual/company name. Therefore it cannot use a database index and requires a FULL TABLE SCAN. This is a very inefficient way to use a database, not much different from simply comparing entries one-by-one in memory.

    That is not the plan for the improved Infoseek. Basic searches use an index. The way I have set up the index makes it much quicker to search larger amounts of data than the current Infoseek.

    Advanced searches are mostly done on a table level, with some exceptions.

    * Generating the complete webpage about a game requires combining data from about 50 tables. More than half of them are 1-N or N-N relations, so that's a fairly complex set of information to process. It requires over 20 database accesses for a single webpage, or a single SELECT combining about 50 tables using OUTER JOINs. Even if the database caches all these results internally, it won't avoid repeating all the hard work required to regroup and reorganize this information into a structured webpage. It would be much easier to simply read the entire content of all tables one-by-one, build the entire data structure in memory, then simply traverse this already-organized data when building each webpage.

    This is true - which is why I have a cache on the server. The results are cached and used to load the page. Only the first visit to a page uses the DB - and that will mostly be done by Googlebot.
    * Almost every archive search is a LIKE based on a partial program title or individual/company name. Therefore it cannot use a database index and requires a FULL TABLE SCAN. This is a very inefficient way to use a database, not much different from simply comparing entries one-by-one in memory.

    For which a much better solution is a full-text search index, either in the DB or using an off-the-shelf tool (as WoS apparently already does). You get all the benefits of much more heavily optimized and tested code, with an almost certainly better feature set (accommodating misspellings, synonyms, relevance ranking, etc.).
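    For example, SQLite's FTS5 module (available in standard CPython builds, assuming the interpreter's SQLite was compiled with FTS5, as is usual) gives tokenized word search out of the box; the titles below are made up:

```python
import sqlite3

# In-memory database with an FTS5 virtual table (titles are made up)
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE entries USING fts5(title)")
con.executemany("INSERT INTO entries(title) VALUES (?)",
                [("Sokoban",), ("Super Sokoban Deluxe",), ("Boulder Dash",)])

# MATCH uses the tokenized index instead of scanning every row with LIKE
rows = sorted(r[0] for r in con.execute(
    "SELECT title FROM entries WHERE entries MATCH ?", ("sokoban",)))
print(rows)  # ['Sokoban', 'Super Sokoban Deluxe']
```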
    * Generating the complete webpage about a game requires combining data from about 50 tables, More than half of them are 1-N or N-N relations, so that's a fairly complex set of information to process. It requires over 20 database accesses for a single webpage, or a single SELECT combining about 50 tables using OUTER JOINs. Even if the database caches all these results internally, it won't avoid repeating all the hard work required to regroup and reorganize all this information into a structured webpage. It would be much easier to simply read the entire content of all tables one-by-one, build the entire data structure in memory, then simply traverse this already organized data when building each webpage.

    For which the solution is Indexed Views/Materialised Views, or an equivalent depending on your RDBMS. Or, at worst, a caching table periodically refreshed with a MERGE. Taking everything out of the database to accomplish the same thing is just enormously pointless, and the assumption that you'll do better than the RDBMS programmers at searching/sorting data is a bit naïve. I've seen so many web developers do this and end up with clunky, difficult-to-maintain solutions that don't perform any better.
    * Web server caching would work better than database caching anyway, since database caching would not avoid repeating all the work involved in extracting the information from SQL results and building a webpage again.

    Therefore this system is a perfect candidate for caching the entire database into a data structure in memory and refreshing it periodically.
    Huh? Your own argument is that caching the resultant web pages makes more sense than caching the database, so why are you still advocating caching the database manually? It doesn't make sense.
  • AndyC wrote: »

    as WoS apparently already does

    No it doesn't....
    Guys, there's no need to keep sending me PMs or FB messages! I know Paul Andrews & Co have edited the Wikipedia template to point all software entries to Spectrum Computing instead of WoS, because Peter told me first!

    Also I have been talking with both Peter and Lee in private about this problem and what we must do about it.

    Although we have been strongly arguing different opinions about the archive, we are basically on the same side and working to achieve roughly the same goal :)
  • Also I have been talking with both Peter and Lee in private about this problem and what we must do about it.

    Although we have been strongly arguing different opinions about the archive, we are basically on the same side and working to achieve roughly the same goal :)

    It's a fun project and I will support it as much as I have - and if this or any project uses the WoS API, or whatever we eventually agree on for ZXDB (which we will, one way or another, at some point), all it proves is that the idiots won't win. Peter and Einar are good guys doing a good thing - if it comes down to it, I am happy to host their work from my business account. We have "discussions", but neither Einar nor Peter is doing anything WoS disagrees with or has argued against.