The OriginTypes table has three codes (o, r & c) that are duplicated based on case. Looking at the data, there really doesn't seem to be any semantic difference between, say, c and C, so I'd think that could easily be fixed by either applying case-insensitive behaviour or forcing the code to upper or lower case. The only linked table appears to be downloads, so it's probably a reasonably straightforward improvement.
"Origintype" was part of Martijn's personal notation about the origin of each download file in the original WoS archive. It doesn't make sense to preserve this field in a new database so I will eventually remove it later, after I use it to double-check every download file was classified properly.
Perhaps Martijn had a reason for distinctions like "c" and "C", but I haven't figured it out yet. In ZXDB, I never discard any information until I can be 100% sure it's unnecessary and redundant. You can simply ignore it for now...
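In the meantime, for anyone curious how the duplicated codes are actually used, a query along these lines shows the split per code. It's only a sketch: it assumes downloads.origintype_id holds the single-letter code (as the conversion script later in this thread suggests), and it relies on ZXDB's binary utf8_bin collation keeping 'c' and 'C' distinct.

-- Sketch: count download files per origin-type code, keeping 'c' and 'C' separate
-- (the binary utf8_bin collation compares case-sensitively).
SELECT origintype_id, COUNT(*) AS download_count
FROM downloads
GROUP BY origintype_id
ORDER BY origintype_id;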
More puzzling is the Magrefs table, where none of the three unique constraints works correctly. Looking at the data, I'm not sure the constraints actually make sense. Some rows vary only by label_id, for example, and I'm not sure that's incorrect (but it breaks things, because the uk_magrefs_topics and uk_magrefs_entry constraints are then no longer unique). Should this actually be one constraint across all three values?
Yes. You must choose either the combined constraint above or the original 3 separate constraints, depending on how your RDBMS implements composite UNIQUE constraints (as discussed earlier in this thread).
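The combined constraint referred to isn't reproduced in this excerpt. Purely as an illustration (the column names below are guesses, not ZXDB's actual magrefs definition), a single composite constraint covering all the referenced ids would look something like this:

-- Hypothetical sketch: one composite UNIQUE constraint instead of three overlapping ones.
-- Column names are placeholders; check the real magrefs definition before using this.
ALTER TABLE magrefs
    ADD CONSTRAINT uk_magrefs_combined
    UNIQUE (magazine_id, issue, page, entry_id, label_id, topic_id);

The RDBMS difference mentioned above is usually about NULLs: MySQL allows any number of rows where part of a composite UNIQUE key is NULL, while SQL Server treats NULL as a single value, so the same constraint can reject data that MySQL accepts.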
So, having got the database up and running, here are the first few rows I've identified that appear to contain duplicated data breaking a unique constraint (uk_feature) in the Features table. It's at least partially due to case sensitivity, but even making the name column case-sensitive won't allow them in SQL Server, as some of the entries don't differ by case. I'm not yet sure why MySQL would allow them at all (spacing perhaps?)
These cases were allowed due to a difference in how MySQL handles composite UNIQUE constraints. I have now eliminated them in today's ZXDB update.
However, there are still a few cases that differ only by case. I'm reluctant to merge these rows because, in a few cases I checked, they refer to unrelated sections with the same name in different magazines... I will need more time to investigate this further. In the meantime, please treat this information as case-sensitive.
These all seem like sound decisions; it's definitely worth keeping case sensitivity until we can be 100% sure the entries aren't supposed to be different.
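If it helps with that investigation, a query along these lines lists the feature names that collide when case is ignored. This is only a sketch; it assumes nothing beyond the name column of the features table, which appears in the create script below.

-- Sketch: fold feature names to lower case and keep only the groups with more
-- than one spelling, i.e. the rows that now differ by case alone.
SELECT LOWER(name) AS folded_name, COUNT(*) AS variants
FROM features
GROUP BY LOWER(name)
HAVING COUNT(*) > 1;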
As promised, here's a Powershell script which will convert the Full MySQL file into SQL Server compatible T-SQL, if anyone wants it. I've tried to avoid changing anything unless I absolutely had to. I do name primary keys, just because it's a pain to change them later when they're system-named, and I also rename unique constraints because SQL Server requires them to be uniquely named objects in the database (I'm just prepending UQ_<tablename>, so it should be easy enough to find the equivalent).
The script will create an Output subfolder, with Tables and Inserts subfolders beneath that. I'm currently writing out a single table and insert script per table, both because it makes debugging easier and because SSMS chugs on my machine with huge T-SQL scripts! Two more scripts are created in the Output folder. The first, preamble.sql, just contains the lead lines that I ignore; I haven't bothered scripting the Create Database command, as file paths, sizes etc. are probably too machine-specific. The second, fixups.sql, contains all the foreign key constraints and any indexes, mostly so you don't have to worry about the order in which the table creation scripts are run.
Columns will be case-insensitive if you use the SQL Server default collations, except for the columns that need case sensitivity. Alternatively, you can create the database with a case-sensitive collation and it should still work throughout, if you prefer.
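For reference, both options look roughly like this in T-SQL. This is a sketch only; in the second statement, varchar(250) NOT NULL is just a placeholder, so re-state the column's actual type and nullability.

-- Option 1: create the whole database with a case-sensitive collation up front.
CREATE DATABASE ZXDB COLLATE Latin1_General_CS_AS;
GO

-- Option 2: flip an individual column to a case-sensitive collation afterwards
-- (the type shown is a placeholder, not the real definition).
ALTER TABLE features
    ALTER COLUMN name varchar(250) COLLATE Latin1_General_CS_AS NOT NULL;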
I haven't tried doing anything with the incremental files. Having had a look, it would probably be easier to apply those changes manually, or just recreate the whole thing from scratch, as remapping all the subtleties between the SQL dialects is a little more difficult.
$ErrorActionPreference = "Stop";
# Create output folders
if (!(Test-Path Output\Tables)) {mkdir Output\Tables}
if (!(Test-Path Output\Inserts)) {mkdir Output\Inserts}
$sourceFile = 'ZXDB_latest_mysql.sql'
$currentOutputFile = 'Output\preamble.sql'
$postfixFile = 'Output\fixups.sql'
@"
Use ZXDB
Go
"@ | Out-File $postfixFile -Encoding utf8
write-progress -id 1 -Activity 'Rewriting files' -status "loading source data" -PercentComplete -1
$source = Get-Content $sourceFile -Encoding utf8
$totalRows = $source.count
$sw = new-object System.IO.Streamwriter($currentOutputfile, $false, [System.Text.Encoding]::UTF8)
for ($rownum =0; $rownum -lt $totalRows; $rownum++)
{
$progress = ($rownum / $totalRows) * 100
write-progress -id 1 -Activity 'Rewriting files' -status "% comlete: $('{0:N2}' -f $progress)" -PercentComplete $progress
$text = $source[$rownum]
$text = $text -replace '`text`','[text]' # escape columns with reserved word names
$text = $text -Replace '`','' # Remove backticks
$text = $text -Replace "\\'","''" # Change escaped quotes
# Change file for table create
If ($text -match '^Create Table If Not Exists (\w+)') {
$tableName = $Matches[1]
$text = @"
Use ZXDB
Go
Drop Table If Exists $tableName;
Create Table $tableName (
"@
$currentOutputfile = "Output\Tables\Table_$($tableName).sql"
$sw.Close()
$sw = new-object System.IO.Streamwriter($currentOutputfile, $false, [System.Text.Encoding]::UTF8)
$rowlimit = 0 # Can have as many rows as we like in Create Table file
$processingTable = $true
}
# Change file for insert into
If ($text -match '^Insert Into (\w+)') {
$tableName = $Matches[1]
$count = 0
$lastfile = $currentOutputFile
$currentOutputfile = "Output\Inserts\Insert_$($tableName).sql"
# don't recreate the file if it's a continuation (because we just hit a secondary INSERT statement)
if ($lastfile -ne $currentOutputFile)
{
$sw.Close()
$sw = new-object System.IO.Streamwriter($currentOutputfile, $false, [System.Text.Encoding]::UTF8)
}
$rowlimit = 1000
$processingTable = $false #Avoid possibility of table fixups corrupting data
$idfix = @"
Use ZXDB
Go
If Exists(Select * From sys.columns Where is_identity = 1 And Object_Name(Object_Id) = '$TableName')
Begin
Set Identity_Insert $tableName On
End
`n
"@
$sw.WriteLine($idfix)
$insertPreamble = $idfix + $text
}
if ($processingTable) {
# Remove weird MySql column width sizes from int types and remove unsigned suffix since SQL Server doesn't have unsigned types
$text = $text -replace '(?<name>\w+)\s+(?<type>\w+)?int\((?<size>\w+)\)\s*(?<sign>unsigned)?\s+','${name} ${type}int '
# Rewrite mediumtext to varchar(max)
$text = $text -replace 'mediumtext','varchar(max)'
#Rewrite tinyint to smallint, since SQL Server doesn't allow negatives in tinyint columns
$text = $text -replace 'tinyint','smallint'
#Remove Default Null constraints, they don't make much sense
$text = $text -replace 'DEFAULT NULL',''
# Fix up identity values
$text = $text -replace 'ENGINE=InnoDB AUTO_INCREMENT=(?<autoincrement>\d+)? DEFAULT CHARSET=utf8 COLLATE=utf8_bin;',"`n dbcc checkident($tablename, RESEED, `${autoincrement})"
# For auto increment columns, make them identity columns (fix up values with dbcc later)
$text = $text -replace 'AUTO_INCREMENT','Identity(1,1)'
# Clean up mysql table options
$text = $text -replace 'ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin',''
#Rewrite Unique Constraints, make sure they're unique as MySql names aren't
$text = $text -replace 'UNIQUE KEY (\[)?(?<keyname>\w+)(\])? \((?<cols>[\[|\]|\w|\s|,]+)\)',"Constraint UQ_$($TableName)_`${keyname} Unique (`${cols})"
#Name primary keys, so it's easier to patch them up later if need be
$text = $text -replace 'PRIMARY KEY \((?<cols>[\[|\]|\w|\s|,]+)\)',"Constraint PK_$($tableName) Primary Key Clustered (`${cols})"
#Deal with inline indexes by writing them to a Post Fix file
if ($text -match '^\s+KEY (?<keyname>\w+) \((?<cols>[\[|\]|\w|\s|,]+)\)')
{
"Create Nonclustered Index $($Matches['Keyname']) On $($TableName)($($Matches['cols']))" | Out-File $postfixFile -Encoding utf8 -Append -NoClobber
$text = '--' + $text
}
# Move foreign key constraint into Post Fix file, so table order deployment doesn't have to matter
if ($text -match '^\s+Constraint\s+(?<name>\w+)\s+Foreign Key \((?<cols>[\[|\]|\w|\s|,]+)\) References (?<foreignTable>\w+) \((?<foreignCols>[\[|\]|\w|\s|,]+)\)')
{
"Alter Table $TableName Add Constraint $($Matches['Name']) Foreign Key ($($Matches['cols'])) References $($Matches['foreignTable']) ($($Matches['foreignCols']))" |
Out-File $postfixFile -Encoding utf8 -Append -NoClobber
$text = '--' + $text
}
# Fix collations
$text = $text -replace 'COLLATE utf8_bin','COLLATE DATABASE_DEFAULT'
# Change columns that currently need case sensitive behaviour
if (($tableName -eq 'aliases' -and $text -match '^\s*title')`
-or ($tableName -eq 'availabletypes' -and $text -match '^\s*id')`
-or ($tableName -eq 'labels' -and $text -match '^\s*name')`
-or ($tableName -eq 'magazines' -and $text -match '^\s*name')`
-or ($tableName -eq 'origintypes' -and $text -match '^\s*id')`
-or ($tableName -eq 'origintypes' -and $text -match '^\s*\[text\]')`
-or ($tableName -eq 'tools' -and $text -match '^\s*title')`
-or ($tableName -eq 'downloads' -and $text -match '^\s*origintype_id')`
-or ($tableName -eq 'entries' -and $text -match '^\s*availabletype_id')`
-or ($tableName -eq 'features' -and $text -match '^\s*name')`
)
{
$text = $text -replace 'DATABASE_DEFAULT','Latin1_General_CS_AS'
}
# Remove unique constraints on tables that seem to have non-unique data
if (($tableName -eq 'magrefs' -and $text -match '^\s*Constraint UQ_magrefs_uk_magref_entry')`
-or ($tableName -eq 'magrefs' -and $text -match '^\s*Constraint UQ_magrefs_uk_magref_label')`
-or ($tableName -eq 'magrefs' -and $text -match '^\s*Constraint UQ_magrefs_uk_magref_topic')`
#-or ($tableName -eq 'features' -and $text -match '^\s*Constraint UQ_features_uk_feature')`
)
{
$text = '--' + $text + ' -- broken constraint'
}
} else {
$count++
# Remove trailing comma from last row of insert
if ($count -eq $rowlimit)
{
if ($text -match '(?<line>\s*\(.*\)),')
{
$text = $Matches['line'] + '; -- reached insert limit'
}
}
# Can only insert 1000 values per insert statement, so adjust to fit
if ($rowlimit -ne 0) {
if ($count -gt $rowlimit) {
$count = 1
$sw.WriteLine("Go`n")
$sw.WriteLine($insertPreamble)
}
}
}
$sw.WriteLine($text)
}
$sw.Close()
I might also have a stab at doing a converter to SQLite if anyone is interested, as it's more likely to be useful for anyone wanting to use ZXDB inside an app. And I could use an excuse to practice some Python. Unlikely to get around to it this weekend though.
As promised, here's a Powershell script which will convert the Full MySQL file into SQL Server compatible T-SQL
This is awesome, thank you!!!
Would you like me to upload your script together with the other ZXDB files? If so, please let me know whenever you update it and I will upload it again.
I haven't tried doing anything with the incremental files. Having had a look, it would probably be easier to apply those changes manually, or just recreate the whole thing from scratch, as remapping all the subtleties between the SQL dialects is a little more difficult.
Agreed, there's no need to do anything with the incremental files. I keep them mostly so anyone interested in validating my work (including myself) can "audit" everything I have done in ZXDB. Everybody else should just worry about the latest version.
In the meantime... ZXDB didn't have the lists of programs contained in each electronic magazine because a file was missing from the package of WoS archive files I got. I have fixed this problem now, the updated ZXDB version is here. Thanks a lot to Thomas for finding this problem and Gerard for providing the missing file!
I might also have a stab at doing a converter to SQLite if anyone is interested, as it's more likely to be useful for anyone wanting to use ZXDB inside an app. And I could use an excuse to practice some Python. Unlikely to get around to it this weekend though.
Yes please!! I'd be very interested to use ZXDB's data set in the ZXSR site rather than what it's using at the mo (Infoseek data forked into a custom Access DB, then converted to SQLite).
Right now, all your ZXSR webpage links are already stored in ZXDB. Did you see my post about it?
https://www.worldofspectrum.org/forums/discussion/comment/892628/#Comment_892628
From now on, what would be the best way to keep ZXSR and ZXDB in sync? For instance, you could periodically send me a spreadsheet just like you gave me before (either with new entries only, or containing all entries again) and I will merge this information into ZXDB every time. It's an easy enough solution until someone automates this process...
Apologies, no I didn't see that post, but that's cool you've added the links to ZXDB.
I'll put something directly on the ZXSR website itself that creates a downloadable CSV file that anyone can grab. Leave it with me and I'll let everyone know on this thread when it's ready.
Another ZXDB version is available here. I finally finished merging all magazine references from SPOT/SPEX into ZXDB! After eliminating all duplicates, ZXDB now contains 168,450 magazine references to programs, books, computers and peripherals (about 1/4 of them were never displayed in WoS webpages). Hurray!
There's still more work to be done. For instance, SPOT/SPEX contains information about game authors and publishers that is missing from or differs from the WoS archive, plus some other useful information, so I will merge that too...
I didn't change anything in your file, but next time you update it, I suggest you add a comment with your name and/or contact information at the top of the file. Also note that you could simplify your script by reading from "ZXDB_latest_generic.sql" instead of "ZXDB_latest_mysql.sql", since it has exactly the same content, except without MySQL directives and with the foreign key declarations moved to the end of the file.
Ah, I may have to have a look at that then. I think I assumed it had them all stripped out and it seemed useful to be able to keep things like indexes and auto-increment values.
In the meantime, here's a Python script for converting the file to SQLite:
import re
import codecs

linecount = 0
inHeader = 1
outFile = codecs.open(r"sqlite.sql", 'w', 'utf-8')
with codecs.open(r"ZXDB_latest_mysql.sql", 'r', 'utf-8') as f:
    for line in f:
        linecount = linecount + 1
        if inHeader:
            # skip the dump header until the USE statement
            if line.startswith('USE'):
                inHeader = 0
        else:
            # strip out backticks
            line = line.replace('`', '')
            # convert escaped single quotes
            line = line.replace(r"\'", "''")
            # change collation
            line = line.replace('utf8_bin', 'rtrim')
            # get rid of auto-increment, as SQLite doesn't support it well enough
            line = line.replace('AUTO_INCREMENT', '')
            # tidy up MySql-isms from end of table definitions (keep the statement terminator)
            if line.startswith(') ENGINE'):
                line = ');\n'
            # SQLite doesn't like unsigned, so manually convert to an integer
            line = re.sub(r'int\s*\(\d+\)\s+unsigned', 'INTEGER', line, flags=re.I)
            # strip out indexes
            if re.match(r'\s*KEY', line, re.I):
                line = ''
            # fix unique constraint syntax and remove names
            line = re.sub(r'\s*UNIQUE KEY \w*\s*\(', 'UNIQUE (', line)
            outFile.write(line)
        if linecount % 10000 == 0:
            print(linecount)
outFile.close()
Hey guys. I've completed ZXSR's Review URL download page, which can be accessed here. Simply accessing the page will automatically download a quote-delimited CSV file. It's also dynamically created, so any new review entries will automatically be included. Whenever I upload any reviews I'll always post a notification in this thread and in ZXSR's dedicated thread here.
Thank you! From time to time, I will download it again and merge the latest ZXSR updates into ZXDB.
This time, I have made hundreds of improvements in alternate titles and comments. ZXDB now also stores information about the original ZX-Spectrum VEGA, including the complete list of 1,000 bundled titles, all of them with exactly the same game title and order as they appear in the VEGA menu. The idea is, if you have a VEGA and see an interesting Speccy game in ZXDB, it will be easier to find this game inside the VEGA menu to play it. And if you want further details about a game you played on the VEGA, it will be easier to find it in ZXDB.
I'm planning to do the same for VEGA+ titles when it's launched. :)
Here's a query to obtain the list of all games inside a compilation:
select c.tape_seq, c.tape_side, c.prog_seq, e.title, c.alias, c.variationtype_id, e.id
from entries e inner join compilations c on c.entry_id = e.id
where c.compilation_id = ?
order by c.tape_seq, c.tape_side, c.prog_seq;
For instance, running this query for value 11265 will list 4 games from this compilation. And running this query for value 1001062 will give you all 1,000 games from the original VEGA...
There's another ZXDB update here with improved descriptions for several computer models. These descriptions can be seen here.
If anyone has suggestions to improve these computer descriptions, or could provide short descriptions for any other computer missing from this list, please let me know!
As I said on 27th Feb, I'd mention in here whenever I add a bulk of reviews. Sinclair User issues 77 to 101 are now available (so only a couple of dozen or so left :-) ). The Export page will download all the references.
ZXDB has been updated again with lots of changes! The latest version is available here.
The conversion work from the WoS archive and SPOT/SPEX is almost finished. There are now only 3 tables left from SPOT/SPEX (containing mostly conflicting information, such as game authors that differ from the WoS archive) and 2 tables left from "lost" files inside the WoS archive (like ZX81 games). Everything else has been fully integrated into ZXDB, fortunately!
Technically, sites hosting ZXDB must be aware that field "filetype_id" in table "downloads" has changed. In previous versions, this field had different values for each file type and file format (for instance different values for "Loading screen" and "Loading screen dump"). Now this field will contain a unique value for each file type (i.e. a single "Loading screen"), and new field formattype_id will distinguish file format (52 for "Screen dump" or 53 for "Picture", for instance). Makes sense?
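To see the new split after upgrading, something along these lines works. Only the filetype_id and formattype_id columns of downloads are confirmed above; the grouping itself is just illustrative.

-- Sketch: count download files per (file type, file format) combination to see
-- how the new formattype_id column partitions each filetype_id value.
SELECT filetype_id, formattype_id, COUNT(*) AS files
FROM downloads
GROUP BY filetype_id, formattype_id
ORDER BY filetype_id, formattype_id;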
I've tweaked the script slightly to make it work with older versions of SQL Server; the latest version is at:
https://dl.dropboxusercontent.com/u/59925874/Rewrite ZXDB.ps1
Thanks again! I will upload it shortly.