
I'm a software developer, not a mathematician. I saw a question about decoding ciphertext so I'm assuming this is not off-topic in this forum.

I have a legacy database that seems to be obfuscated, and the original program used to record this data is lost.

\begin{array} {l|l} \mathbf{PlainText} & \mathbf{Obfuscated} \\ \hline \mathtt{ACURA} & \mathtt{9A929BB646D5}\\ \mathtt{AGRALE} & \mathtt{9A9297AFBF5EEF}\\ \mathtt{AUDI} & \mathtt{9A92A9ABBB}\\ \mathtt{BENELLI} & \mathtt{9A9594A4A6A0A0BC}\\ \mathtt{BETA} & \mathtt{9A9594AEA0}\\ \mathtt{BULL} & \mathtt{9A95A4B64C}\\ \mathtt{CHEVROLET} & \mathtt{9A949291A8A8BB5DE066}\\ \mathtt{CHRYSLER} & \mathtt{9A9492A2B45AEA65F5}\\ \mathtt{CN AUTO} & \mathtt{9A94AC8A848B93B7}\\ \mathtt{DAFRA} & \mathtt{9A97969AA3B6}\\ \mathtt{DKW-VEMAG} & \mathtt{9A97AC42202425272A22}\\ \mathtt{EAGLE} & \mathtt{9A9699A6BD51}\\ \mathtt{ENGESA} & \mathtt{9A96AAB7B355DA}\\ \mathtt{FERRARI} & \mathtt{9AA9A0B448DB62FE}\\ \mathtt{FIAT} & \mathtt{9AA9BCBB5F}\\ \mathtt{GREEN} & \mathtt{9AA8B4BF4ACA}\\ \mathtt{HONDA} & \mathtt{9AABB445C65A}\\ \mathtt{HUMMER} & \mathtt{9AAB4FDA67FE1D}\\ \mathtt{HUSQVARNA} & \mathtt{9AAB4FE4799D92B142D0}\\ \mathtt{HYUNDAI} & \mathtt{9AAB4BE67AEC62FE}\\ \mathtt{JAGUAR} & \mathtt{9AADA0A1B9A8B6}\\ \mathtt{JINBEI} & \mathtt{9AADB841CC40C5}\\ \mathtt{KAWASAKI} & \mathtt{9AACA3BCB254D970FB}\\ \mathtt{LAMBORGHINI} & \mathtt{9AAFBE4AC341DF72F811061C}\\ \mathtt{MAN} & \mathtt{9AAEA1A9}\\ \mathtt{MCLAREN} & \mathtt{9AAEBF4AC445C640}\\ \mathtt{MINI} & \mathtt{9AAEB94ED8}\\ \mathtt{MV AGUSTA} & \mathtt{9AAE4B2D213AC342D445}\\ \mathtt{OLDSMOBILE} & \mathtt{9AA0A2A0BC58EB7B86819F}\\ \mathtt{PIAGGIO} & \mathtt{9AA3A2A5A3B84EC8}\\ \mathtt{PLYMOUTH} & \mathtt{9AA3A1BC45C650F178}\\ \mathtt{ROYAL ENFIELD} & \mathtt{9AA5BA52DC7BD748D448C85DFD07}\\ \mathtt{SATURN} & \mathtt{9AA4AB46D4758F}\\ \mathtt{SCANIA} & \mathtt{9AA4A9ACB5ACA1}\\ \mathtt{SCION} & \mathtt{9AA4A9B44BCB}\\ \mathtt{SEAT} & \mathtt{9AA4A7AE4C}\\ \mathtt{SHINERAY AUTO} & \mathtt{9AA4A2ADB4ABB1A74330282ED663}\\ \mathtt{SINO TRUK} & \mathtt{9AA4A3B748CE6D9798B0}\\ \mathtt{SUZUKI} & \mathtt{9AA4B754E660E5}\\ \mathtt{TRAXX} & \mathtt{9AA7B7BE58E2}\\ \mathtt{TRIUMPH} & \mathtt{9AA7B747D3738F82}\\ \mathtt{VOLARE} & 
\mathtt{9AB947D558F872}\\ \mathtt{VOLVO} & \mathtt{9AB947D563E0}\\ \mathtt{GASOLINA} & \mathtt{9AA8A7BC43DD6BEC6C}\\ \mathtt{ETANOL} & \mathtt{9A96A4A3BE5CE4}\\ \mathtt{GÁS} & \mathtt{9AA8B849}\\ \mathtt{DIESEL} & \mathtt{9A97AEB546D96A}\\ \mathtt{FLEX} & \mathtt{9AA9BB47D0}\\\hline \end{array}

If we assume each character is encoded as one byte (two hexadecimal digits), the obfuscated text is always exactly one byte longer than the plain text.
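
The length observation can be checked mechanically. This is a minimal sketch on a few pairs copied from the table above; the helper name `length_overhead` is made up for illustration.

```python
# A handful of (plaintext, obfuscated hex) pairs copied from the table above.
PAIRS = [
    ("ACURA", "9A929BB646D5"),
    ("AUDI", "9A92A9ABBB"),
    ("CN AUTO", "9A94AC8A848B93B7"),
    ("DKW-VEMAG", "9A97AC42202425272A22"),
    ("MAN", "9AAEA1A9"),
    ("ROYAL ENFIELD", "9AA5BA52DC7BD748D448C85DFD07"),
]


def length_overhead(plain: str, hex_text: str) -> int:
    """Return (number of obfuscated bytes) - (number of plaintext characters)."""
    return len(bytes.fromhex(hex_text)) - len(plain)


# Map each plaintext to its overhead in bytes, and collect the leading bytes.
overheads = {plain: length_overhead(plain, hex_text) for plain, hex_text in PAIRS}
headers = {bytes.fromhex(hex_text)[0] for _, hex_text in PAIRS}

print(overheads)
print([hex(h) for h in headers])
```

For these samples the overhead is 1 byte everywhere and the only leading byte is 0x9A, which is consistent with a fixed one-byte header followed by one byte per character.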

The first byte is always 0x9A, and the algorithm appears to be deterministic: for example, "ACURA", "AGRALE" and "AUDI" all continue with the same byte, so I guess 0x92 right after the 0x9A encodes "A". It looks like each subsequent byte is a function of the previous one.
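
One way to probe this is to compare how many leading letters two plaintexts share with how many leading bytes their obfuscated forms share. The sketch below uses seven pairs from the table; the function names are hypothetical.

```python
from itertools import combinations

# Plaintext -> obfuscated hex, copied from the table above.
PAIRS = {
    "ACURA": "9A929BB646D5",
    "AGRALE": "9A9297AFBF5EEF",
    "AUDI": "9A92A9ABBB",
    "VOLARE": "9AB947D558F872",
    "VOLVO": "9AB947D563E0",
    "HONDA": "9AABB445C65A",
    "HUMMER": "9AAB4FDA67FE1D",
}


def common_prefix_len(a, b) -> int:
    """Length of the longest common prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_rows(pairs):
    """For every pair of words, return (word1, word2, shared letters, shared bytes)."""
    rows = []
    for (p1, h1), (p2, h2) in combinations(pairs.items(), 2):
        shared_letters = common_prefix_len(p1, p2)
        shared_bytes = common_prefix_len(bytes.fromhex(h1), bytes.fromhex(h2))
        rows.append((p1, p2, shared_letters, shared_bytes))
    return rows


rows = prefix_rows(PAIRS)
for p1, p2, k, m in rows:
    if k > 0:
        print(f"{p1:7} {p2:7} shared letters={k} shared bytes={m}")
```

On these samples the number of shared bytes is always the number of shared letters plus one (the header), which suggests that each output byte depends only on the plaintext up to and including that position.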

I would bet it is some combination of XOR and/or shifting with the previous value. I guess I could try to brute-force the data using frequency tables but I wonder if there is a smarter way to find the obfuscation function.

So the question is: given the hypothesis that the byte at index i in the cipher text is a simple function of the byte at index i-1, is it possible to derive the function?

[update]

SmileyCraft noticed that it doesn't look like a simple function of the previous byte. In the cipher text of DAFRA there is a 0x9A preceding the byte for R, but the result is not the same as for the R in ROYAL ENFIELD, which is also preceded by 0x9A.
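
This kind of counterexample can be searched for automatically. The sketch below groups output bytes by a candidate "state" and reports any state that maps to more than one output byte. Hypothesis A is the one discussed above (current letter plus previous cipher byte); hypothesis B (current letter plus position) is an extra assumption added here only for comparison. The function name `conflicts` is hypothetical.

```python
from collections import defaultdict

# (plaintext, obfuscated hex) pairs copied from the table above.
PAIRS = [
    ("DAFRA", "9A97969AA3B6"),
    ("EAGLE", "9A9699A6BD51"),
    ("ROYAL ENFIELD", "9AA5BA52DC7BD748D448C85DFD07"),
]


def conflicts(pairs, key_fn):
    """Return every key that is observed with more than one distinct output byte.

    key_fn(plain, cipher, i) builds the candidate state for plaintext index i.
    cipher[0] is the 0x9A header, so cipher[i + 1] is the byte for plain[i].
    """
    seen = defaultdict(set)
    for plain, hex_text in pairs:
        cipher = bytes.fromhex(hex_text)
        for i in range(len(plain)):
            seen[key_fn(plain, cipher, i)].add(cipher[i + 1])
    return {key: sorted(outs) for key, outs in seen.items() if len(outs) > 1}


# Hypothesis A: output byte = f(current letter, previous cipher byte).
by_prev_byte = conflicts(PAIRS, lambda plain, cipher, i: (plain[i], cipher[i]))

# Hypothesis B: output byte = f(current letter, position in the word).
by_position = conflicts(PAIRS, lambda plain, cipher, i: (plain[i], i))

print(by_prev_byte)
print(by_position)
```

With these three words, hypothesis A fails on the letter R after 0x9A (0xA3 in DAFRA, 0xA5 in ROYAL ENFIELD), and hypothesis B fails on the letter A in second position (0x96 in DAFRA, 0x99 in EAGLE). An empty result would not prove a hypothesis; it would only mean the sample contains no contradiction.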

[update 2]

"The first byte is always fixed at 9A. Have you figured out the second byte based on the first char of the plain text yet?" – Somos

Well, no. I was pursuing a curious pattern in the bits, but I don't know if it is a useful property. Looking at the bits, the least significant bit always alternates in a 0101 pattern, the second bit in a 00110011 pattern, the next in 00001111, and so on:

\begin{array} {c|l|l} & \mathbf{PlainText} & \mathbf{Obfuscated} \\ \hline A & 01000001 & 10010010\\ B & 01000010 & 10010101\\ C & 01000011 & 10010100\\ D & 01000100 & 10010111\\ E & 01000101 & 10010110\\ F & 01000110 & 10101001\\ G & 01000111 & 10101000\\ H & 01001000 & 10101011\\ I & 01001001 & 10101010\\ J & 01001010 & 10101101\\ K & 01001011 & 10101100\\ L & 01001100 & 10101111\\ M & 01001101 & 10101110\\ N & 01001110 & 10100001\\ O & 01001111 & 10100000\\ P & 01010000 & 10100011\\ Q & 01010001 & 10100010\\ R & 01010010 & 10100101\\ S & 01010011 & 10100100\\ T & 01010100 & 10100111\\ U & 01010101 & 10100110\\ V & 01010110 & 10111001\\ W & 01010111 & 10111000\\ X & 01011000 & 10111011\\ Y & 01011001 & 10111010\\ Z & 01011010 & 10111101\\ \end{array}
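
The counter-like bit pattern in this table is what one would expect from an addition combined with an XOR mask. As a sketch only, the code below brute-forces a hypothetical model `c = ((p + a) mod 256) XOR b` against the 26 first-letter values in the table. The model and the names `add_then_xor` and `fit_add_then_xor` are assumptions for illustration, not the known algorithm.

```python
import string

# Obfuscated byte for each first letter A..Z, copied from the table above.
FIRST_BYTE = bytes.fromhex(
    "9295949796A9A8ABAAADACAFAEA1A0A3A2A5A4A7A6B9B8BBBABD"
)
TABLE = dict(zip(string.ascii_uppercase, FIRST_BYTE))


def add_then_xor(p: int, a: int, b: int) -> int:
    """Hypothetical model: add a constant modulo 256, then XOR with a constant."""
    return ((p + a) & 0xFF) ^ b


def fit_add_then_xor(table):
    """Return all (a, b) pairs that reproduce every entry of the table."""
    return [
        (a, b)
        for a in range(256)
        for b in range(256)
        if all(add_then_xor(ord(ch), a, b) == c for ch, c in table.items())
    ]


solutions = fit_add_then_xor(TABLE)
print([(hex(a), hex(b)) for a, b in solutions])
```

The fit is not unique: for example, both (a, b) = (0x5A, 0x09) and (0x9A, 0x49) reproduce the table. It also covers only the first plaintext character and says nothing about how later positions are encoded.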

All 141 pairs I have so far, guessed from associated images:

[('9A929BB646D5', 'ACURA'),
 ('9A9297AFBF5EEF', 'AGRALE'),
 ('9A9290909EEC7397A6B851', 'ALFA ROMEO'),
 ('9A9291F770E77AEA7F9386', 'AM GENERAL'),
 ('9A929194A1A2BCA8B9', 'AMAZONAS'),
 ('9A92ACB84DCB59CF', 'APRILIA'),
 ('9A92ABB2BC8E97B34ACA44C4', 'ASIA MOTORS'),
 ('9A92AB46DA7BD770F315302A2C', 'ASTON MARTIN'),
 ('9A92A9ABBB', 'AUDI'),
 ('9A9594A4A6A0A0BC', 'BENELLI'),
 ('9A9594A4B756D764', 'BENTLEY'),
 ('9A9594AEA0', 'BETA'),
 ('9A95909BA5ABA0', 'BIMOTA'),
 ('9A95AC42', 'BMW'),
 ('9A95A4AFB453', 'BUELL'),
 ('9A95A4ADA1A7B754', 'BUGATTI'),
 ('9A95A4ABA1BE', 'BUICK'),
 ('9A95A4B64C', 'BULL'),
 ('9A949B99ADABBBA9AE', 'CADILLAC'),
 ('9A949BA4A2AAA7', 'CAGIVA'),
 ('9A9498AA', 'CBT'),
 ('9A949295ACA1AEA9', 'CHANGAN'),
 ('9A949291AC54', 'CHERY'),
 ('9A949291A8A8BB5DE066', 'CHEVROLET'),
 ('9A9492A2B45AEA65F5', 'CHRYSLER'),
 ('9A9493A1BC5E66E1', 'CITROEN'),
 ('9A94AC8A848B93B7', 'CN AUTO'),
 ('9A94A8B14ACFA3BAB95BC65FE5', 'CROSS LANDER'),
 ('9A97969FA7BA', 'DACIA'),
 ('9A97969DA6BD47', 'DAELIM'),
 ('9A97969DBB59E4', 'DAEWOO'),
 ('9A97969A', 'DAF'),
 ('9A97969AA3B6', 'DAFRA'),
 ('9A979699AEBD5EE478', 'DAIHATSU'),
 ('9A9796AC4FF609', 'DATSUN'),
 ('9A97AC42202425272A22', 'DKW-VEMAG'),
 ('9A97A8AABE56', 'DODGE'),
 ('9A97A2A3AB52D7', 'DUCATI'),
 ('9A9699A6BD51', 'EAGLE'),
 ('9A9696AE', 'EBR'),
 ('9A96929E90', 'EFFA'),
 ('9A96AC8A8194A4B0BE58C4', 'EL DETALLE'),
 ('9A96AAB7B355DA', 'ENGESA'),
 ('9AA9A0A8A2BD4F', 'F.N.M.'),
 ('9AA9A4B446DD6D', 'FANTIC'),
 ('9AA9A0B448DB62FE', 'FERRARI'),
 ('9AA9BCBB5F', 'FIAT'),
 ('9AA9B64FDC', 'FORD'),
 ('9AA9B64DD370', 'FOTON'),
 ('9AA8A7BC9388858D', 'GAS GAS'),
 ('9AA8A3AEB55D', 'GEELY'),
 ('9AA8BF4AC041CE', 'GILERA'),
 ('9AA8BBB8', 'GMC'),
 ('9AA8B4BF4ACA', 'GREEN'),
 ('9AA8B340C85CE4', 'GURGEL'),
 ('9AABA2AEBC54', 'HAFEI'),
 ('9AABA2B2B1A4B18B8D9DAAA0B041D36A', 'HARLEY DAVIDSON'),
 ('9AABB445C65A', 'HONDA'),
 ('9AABB445C65A36D664EB62E6', 'HONDA MOTOS'),
 ('9AAB4FDA67FE1D', 'HUMMER'),
 ('9AAB4FE4799D92B142D0', 'HUSQVARNA'),
 ('9AAB4BDC7F86988A', 'HYOSUNG'),
 ('9AAB4BE67AEC62FE', 'HYUNDAI'),
 ('9AAAB6BC49D86B', 'INDIAN'),
 ('9AAAB6BA4BCB59F800', 'INFINITI'),
 ('9AAAB64DDD62FC6BFD143ADB49D0', 'INTERNATIONAL'),
 ('9AAAB244D8', 'IROS'),
 ('9AAAB34FE66E', 'ISUZU'),
 ('9AAA4FD259FA', 'IVECO'),
 ('9AADA0A5', 'JAC'),
 ('9AADA0A1B9A8B6', 'JAGUAR'),
 ('9AADBC44DB', 'JEEP'),
 ('9AADB841CC40C5', 'JINBEI'),
 ('9AADB34A', 'JPX'),
 ('9AACA3B0B657E667F2', 'KASINSKI'),
 ('9AACA3BCB254D970FB', 'KAWASAKI'),
 ('9AACBF43D545D2', 'KEEWAY'),
 ('9AACBBBA95B04CF5070A04', 'KIA MOTORS'),
 ('9AACB5BC44DF6BEB73E968', 'KOENIGSEGG'),
 ('9AAC4FDA', 'KTM'),
 ('9AAFBE45C9', 'LADA'),
 ('9AAFBE4AC341DF72F811061C', 'LAMBORGHINI'),
 ('9AAFBE4BC159D6', 'LANCIA'),
 ('9AAFBE4BC0B249CD66F812', 'LAND ROVER'),
 ('9AAFBA55E56B', 'LEXUS'),
 ('9AAFB6BAB451', 'LIFAN'),
 ('9AAFB643C94BDB7F', 'LINCOLN'),
 ('9AAFB043D779', 'LOTUS'),
 ('9AAEA1AFB754D472F1', 'MAHINDRA'),
 ('9AAEA1A9', 'MAN'),
 ('9AAEA1B2B859D67E85', 'MASERATI'),
 ('9AAEA1BD4DDC', 'MAZDA'),
 ('9AAEBF4AC445C640', 'MCLAREN'),
 ('9AAEBD56D64EDE7186E07AEC6F8C', 'MERCEDES BENZ'),
 ('9AAEBD56D67E9CA0', 'MERCURY'),
 ('9AAEB94ED8', 'MINI'),
 ('9AAEB948D478F6152A2133', 'MITSUBISHI'),
 ('9AAEB340C858EA', 'MORGAN'),
 ('9AAEB34ED2A0AB54EC1404', 'MOTO GUZZI'),
 ('9AAE4B2D213AC342D445', 'MV AGUSTA'),
 ('9AA1A4B14AD964', 'NISSAN'),
 ('9AA0A2A0BC58EB7B86819F', 'OLDSMOBILE'),
 ('9AA0BE42C1', 'OPEL'),
 ('9AA3A6BD4ADD61E0', 'PEUGEOT'),
 ('9AA3A2A5A3B84EC8', 'PIAGGIO'),
 ('9AA3A1BC45C650F178', 'PLYMOUTH'),
 ('9AA3BC4DEE6AE77E', 'PONTIAC'),
 ('9AA3BC49D345C152', 'PORSCHE'),
 ('9AA3B642CC', 'PUMA'),
 ('9AA5A4B4BA42C242', 'RENAULT'),
 ('9AA5BA41C244282F3CC653CB', 'ROLLS ROYCE'),
 ('9AA5BA57D374', 'ROVER'),
 ('9AA5BA52DC7BD748D448C85DFD07', 'ROYAL ENFIELD'),
 ('9AA4ABAAA3', 'SAAB'),
 ('9AA4ABBE5DE063FF', 'SAMSUNG'),
 ('9AA4AB46D4758F', 'SATURN'),
 ('9AA4A9ACB5ACA1', 'SCANIA'),
 ('9AA4A9B44BCB', 'SCION'),
 ('9AA4A7AE4C', 'SEAT'),
 ('9AA4A2ADB4ABB1A74330282ED663', 'SHINERAY AUTO'),
 ('9AA4A2ADB4ABB1A743302420202A3E', 'SHINERAY MOTOR'),
 ('9AA4BF47D679', 'SMART'),
 ('9AA4B9BC44D97F9BABA1', 'SSANGYONG'),
 ('9AA4B7BF4EF204', 'SUBARU'),
 ('9AA4B740CB49EC6E', 'SUNDOWN'),
 ('9AA4B754E660E5', 'SUZUKI'),
 ('9AA4B754E660E553E261EC6F96', 'SUZUKI MOTOR'),
 ('9AA7A6BCB2', 'TATA'),
 ('9AA8A7BF46C65C', 'TORBAL'),
 ('9AA7B854EC1318', 'TOYOTA'),
 ('9AA7B7BE58E2', 'TRAXX'),
 ('9AA7B747D3738F82', 'TRIUMPH'),
 ('9AA7B741C25DEE14', 'TROLLER'),
 ('9AA7B340', 'TVR'),
 ('9AB9B44CEB66EB6DFB', 'VAUXHALL'),
 ('9AB9B042DD4D', 'VESPA'),
 ('9AB947D558F872', 'VOLARE'),
 ('9AB947D56E93A6B2BB52F9', 'VOLKSWAGEN'),
 ('9AB947D563E0', 'VOLVO'),
 ('9AB8B740C870F9', 'WANGYE'),
 ('9ABAB545C940CD', 'YAMAHA')]
  • DAFRA gives 9A97969AA3B6 while ROYAL ENFIELD gives 9AA5BA52DC7BD748D448C85DFD07. In DAFRA you find 9A and then the letter R gives A3. In ROYAL ENFIELD you find 9A and then the letter R gives A5. Hence, it is not a function of the last character. – SmileyCraft Jan 01 '19 at 22:31
  • Sorry, I don't really know how to express this using math but I think DAFRA would be encoded like f("A", f("R", f("F", f("A", f("D", 0x9A))))), this is what I mean by "a function of the previous character". – Paulo Scardine Jan 01 '19 at 22:43
  • But DAFRA gives us f("R",0x9A)=0xA3 and ROYAL ENFIELD gives us f("R",0x9A)=0xA5. – SmileyCraft Jan 01 '19 at 22:44
  • Since every cipher text starts with 0x9A I assume it is the key. VOLVO and VOLARE both start with 0x9AB947D5 so I'm guessing f("V", 0x9A) = 0xB9, f("O", 0xB9) = 0x47 and so on... – Paulo Scardine Jan 01 '19 at 22:54
  • DAFRA gives f("D",0x9A)=0x97, f("A",0x97)=0x96, f("F",0x96)=0x9A, f("R",0x9A)=0xA3 and f("A",0xA3)=0xB6. However, ROYAL ENFIELD gives f("R",0x9A)=0xA5. This is a problem since 0xA3 is not the same as 0xA5. I really don't see what you don't understand here. – SmileyCraft Jan 01 '19 at 23:00
  • Note that I'm guessing 0x9A is used only for the first position, so the "R" in DAFRA would be encoded using another value (the cumulative result of the function applied to the preceding letters, "DAF"). – Paulo Scardine Jan 01 '19 at 23:02
  • Look at the obfuscated DAFRA: 9A97969AA3B6 – SmileyCraft Jan 01 '19 at 23:03
  • Oh, I see, thanks. So there is a longer key or the result has something to do with the position of the character... – Paulo Scardine Jan 01 '19 at 23:06
  • The first byte is always fixed at $\texttt{9A}$. Have you figured out the second byte based on the first char of the plain text yet? – Somos Jan 02 '19 at 00:45
  • An $\mathtt{XOR}$ 1 has been used for the first char. The 16 letters "F...U" have been ordered in the following way: $\mathtt{TUFGHI...RS}$; then, starting from the value $\mathtt{0xA7}$, $\mathtt{XOR}$ 1 has been applied. The same rule applies for the preceding and the following blocks (16 characters long?). – zar Jan 02 '19 at 10:51
  • The more plain vs obfuscated text pairs we have, the easier the job. Do you have any more such pairs? – Somos Jan 03 '19 at 03:35
  • So far I have 141 pairs, guessed from associated images. See my last update. – Paulo Scardine Jan 03 '19 at 05:43
  • (Have you tried to post this question to https://crypto.stackexchange.com/ ?) – zar Jan 03 '19 at 22:29
  • Oh, I see. Maybe yes, it could be off-topic. – zar Jan 04 '19 at 23:08
