  #50  
Old 09-27-2019, 04:45 PM
Benanov
Sarnak


Join Date: Jun 2019
Posts: 352
Technical explanation follows

Quote:
Originally Posted by Nargule
I then downloaded Visual Studio Code and discovered the file got converted from UTF-8 to UTF-8 with BOM. I have no idea what BOM is, but converting it back to standard UTF-8 encoding fixed the issue and now I have my strings back, including the crafting success string!
BOM is the Unicode "byte order mark". In the wider encodings (UTF-16, UTF-32) it tells programs reading the file which way the bytes go: files are read a byte at a time, but each character is stored in units of two or four bytes, and different machines write those bytes in different orders. Basically - is it BA or AB? ABCD or DCBA? That's what a BOM is for.
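To make the "BA or AB" question concrete, here is a minimal Python sketch. The helper name decode_utf16_with_bom is made up for illustration; the BOM constants come from Python's standard codecs module. It only handles UTF-16, not UTF-32.

```python
import codecs

text = "AB"

# The same two characters, written with opposite byte orders and no BOM.
big = text.encode("utf-16-be")     # b'\x00A\x00B'
little = text.encode("utf-16-le")  # b'A\x00B\x00'


def decode_utf16_with_bom(data: bytes) -> str:
    """Hypothetical helper: pick the byte order from a leading BOM, then decode."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return data[2:].decode("utf-16-le")
    raise ValueError("no UTF-16 BOM found; byte order is ambiguous")


# With the BOM in front, either byte order decodes back to the same text.
from_big = decode_utf16_with_bom(codecs.BOM_UTF16_BE + big)
from_little = decode_utf16_with_bom(codecs.BOM_UTF16_LE + little)

# Guessing the wrong order does not fail; it silently yields other characters.
wrong_guess = big.decode("utf-16-le")
```

Without the BOM, a reader has to guess, and a wrong guess still decodes without complaint, which is why the mark exists for these encodings.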

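However, in UTF-8, this is pretty much pointless. A UTF-8 character can take anywhere from one to four bytes, but the encoding itself fixes the order those bytes come in, so there's no byte order to signal. The BOM (the three bytes EF BB BF at the very start of the file) basically is only useful to say "HEY, THIS FILE IS UTF-8!" and is generally regarded as superfluous.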
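For anyone who'd rather script the fix than round-trip through an editor, here is a rough Python sketch of the same "convert back to plain UTF-8" step. strip_utf8_bom is a made-up name, and the sample bytes are just an illustration, not an actual EQ string file.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Hypothetical helper: drop a leading UTF-8 BOM (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustrative file contents, as an editor would save them "with BOM".
with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")

cleaned = strip_utf8_bom(with_bom)

# A plain UTF-8 decode keeps the BOM as an invisible U+FEFF first character,
# which is the kind of thing that trips up a parser not expecting it.
decoded_plain = with_bom.decode("utf-8")

# Python's "utf-8-sig" codec skips the BOM when it is there.
decoded_sig = with_bom.decode("utf-8-sig")
```

To use it on a real file, read the file in binary mode, pass the bytes through the helper, and write them back in binary mode. Keep a backup first.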

Older software, like EQ, will choke and die on the BOM. UTF text wasn't big in Windows at that time - most of the time you'd have files in one or more code pages, with Windows-1252 being the dominant one for English text in the US. (For giggles: ASCII, UTF-8 and Windows-1252 are generally seen as compatible because they agree on the plain ASCII characters, but anything outside that range will be all sorts of broken if you don't properly translate from one to the other. "Smart quotes" are the most common points of breakage.)
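A small Python sketch of the smart-quote breakage, using a curly apostrophe as the example character. The sample strings are made up; "cp1252" is Python's name for Windows-1252.

```python
text = "It\u2019s"  # "It's" with a curly apostrophe (U+2019)

# Plain ASCII text is byte-for-byte identical in all three encodings.
plain = "crafting success"
same_bytes = plain.encode("ascii") == plain.encode("utf-8") == plain.encode("cp1252")

# UTF-8 bytes read as Windows-1252: no error, just three junk characters
# where the apostrophe used to be.
garbled = text.encode("utf-8").decode("cp1252")

# Windows-1252 bytes read as UTF-8: the lone 0x92 byte is not valid UTF-8,
# so a strict decoder refuses it outright.
try:
    text.encode("cp1252").decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

print(same_bytes, ascii(garbled), strict_decode_failed)
```

The first direction is the sneaky one, since nothing fails and the garbage only shows up when a human reads the output.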

I'm not surprised EQ never accounted for people mucking with the interface text files. They're controlled by the server and would have been updated / checksummed by the patcher, so it doesn't surprise me that no engineering effort was expended in making sure EQ could handle a text file in a different format than what was expected.

If I had $10 for every time I had a BOM cause a bomb at my old job (I did a lot of syndication of news articles), I'd have eaten lunch for free most days.