# Yet Another Method to Uglify Important String Constants in JS

## 文章目录

This algorithm is designed & implemented by JieJiSS.

这篇博客应该会是全英文的（除了这几句话以外）……一方面是为了学校安排的几周后的“高中生人工智能教育”研讨会上的全英文 Presentation 做准备，一方面也是想体验一下某些站在鄙视链顶端的大牛们所提倡的写博客的方式。我承认中文互联网技术圈充满了劣质的资料，但是博客的水平难道不是由博主的技术水平决定的吗？跟写作所使用的自然语言的种类又有什么关系？

Anyway, 你说英文写博客好，那我就试试呗

## 0x00 Basic Variables & Helper Functions

Define

`raw`

: The variable`raw`

is used to represent the raw string constant which needs to be encoded. Beware that this isnotan encryption algorithm, which means that it is not cryptographically secure.

### 0x00.1 *Variable* `dataArr: number[]`

This is a pre-generated array which stores at least 9 (1+2+2*2+2) elements.

Define \(m\) as: \[ m = \max\,(2, \lceil\frac{\texttt{raw.length}}{7}\rceil) \]

Magic number 7: Every ASCII string that has no more than 7 characters can be encoded to a base64 string that has a maximum length of 10 characters (exclude trailing equal signs), which can be safely treated as a 36-decimal integer without any precision lost under the IEEE754 double data type. That's because in that case, `parseInt(encodedStr, 36)`

always returns an integer that is lower than `Number.MAX_SAFE_INTEGER`

, i.e. \(2^{53}-1=9007199254740991\), since `9007199254740991 .toString(36)`

returns `"2gosa7pa2gv"`

(which has 11 characters).

### 0x00.2 *Helper* `f(x)`

This helper function is used to calculate the "system" of the `x`

-th element of `dataArr`

. `f(x)`

could be any function which satisfied following criteria:

- \(f(0) \equiv 10 \mod 256\)
- \(f(1) \equiv 36 \mod 256\)
- \(f(2) \equiv 2 \mod 256\)
- \(f(3) \equiv 2 \mod 256\)
- \(\lfloor f(k) \rfloor =f(k) \quad(k\in\mathbb N^*)\)
- \(f(k)>0\quad (k=0,1,2,3)\)

Here are some available `f(x)`

s: \[
\begin{aligned}
f_1(x)&=-27x^3+179x^2-126x+10\\
f_2(x)&=101x^3-333x^2+258x+266
\end{aligned}
\]

1 | function f1(x) { |

**How to create a valid \(f(x)\)?**

Let \(f(x)=ax^3+bx^2+cx+d\).

According to the first criterion, we know \(f(0)=d \equiv 10 \mod 256\).

So, we can simply choose values that look good for \(d\): 10, 266, 522, …

For similar reasons, we know that \(f(1)-d=a+b+c\equiv 26\mod 256\), \(f(2)-d=8a+4b+2c\equiv 248\mod 256\), and \(f(3)-d=27a+9b+3c \equiv 248\mod 256\). Note that \(27a+9b+3c\equiv 0\mod 3\), thus in order to get integer values for \(a\), \(b\), and \(c\), we need to ensure that \(27a+9b+3c=248+(3k+1)\times256\). Detailed explanation is that \(248=3\times82+2\) and \(256=3\times85+1\), hence \(248+256+3k\times256 = 504+3k\times256\) can always be divided by 3. After these calculations, we can simply solve the ternary systems of equations and get a possible combination for \(a\), \(b\) and \(c\).

To quickly solve such systems of equations, you may need to know how to use the Gaussian Elimination.

Cited from https://yutsumura.com/solving-a-system-of-linear-equations-using-gaussian-elimination/: \[ \begin{align*} x+2y+3z &=4 \\ 5x+6y+7z &=8\\ 9x+10y+11z &=12 \end{align*} \] The three

elementary row operations on a matrixare defined as follows.

Interchanging two rows: \(R_i \leftrightarrow R_j\) interchanges rows \(i\) and \(j\).

Multiplying a row by a non-zero scalar (a number):

\(t R_i\) multiplies row ii by the non-zero scalar (number) \(t\).

Adding a multiple of one row to another row:

\(R_j+tR_i\) adds \(t\) times row \(i\) to row \(j\).

SolutionFirst, the augmented matrix of the system is: \[ A=\left[ \begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{array} \right]. \] We apply elementary row operations as follows to reduce the system to row echelon form. \[ A \xrightarrow{R_3 -9R_1} \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 0 & -8 & -16 & -24 \end{array}\right] \xrightarrow{-\frac{1}{8}R_3} \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 0 & 1 & 2 & 3 \end{array}\right] \]

\[ \xrightarrow{R_2-5R_1} \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 0 & -4 & -8 & -12 \\ 0 & 1 & 2 & 3 \end{array}\right] \xrightarrow{-\frac{1}{4} R_2} \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 2 & 3 \end{array}\right] \]

\[ \xrightarrow{R_3-R_2} \left[\begin{array}{rrr|r} 1 & 2 & 3 & 4 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right] \]

The last matrix is in row echelon form.

The corresponding system of linear equations of it is: \[ \begin{align*} x+2y+3z &=4\\ y+2z&=3 \\ 0z&=0 \end{align*} \]

### 0x00.3 *Helper* `b64(str)`

This helper function encodes `str`

to a modified-base64 string literal. Note that it should remove all trailing equal signs. Also, note that the whole `b64templ`

string except its last three characters is reversed to annoy browsers' built-in `atob`

base64 decode function.

1 | const revb64templ = "9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA+/="; |

### 0x00.4 *Helper* `deb64(str)`

Decode base64 encoded str.

1 | const revb64templ = "9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA+/="; |

### 0x00.5 *Helper* `rndChar()`

This helper function is used to generate a random character ranged in `a-zA-Z0-9`

.

1 | const availableChars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"; |

### 0x00.6 *Helper* `padLeft(str, len, ch)`

This helper function will repeatedly add `ch`

as left padding to `str`

until `str.length === len`

.

1 | function padLeft(str, len, ch) { |

## 0x01 Generator

For instance, we'd like to hide a string constant `"Hello, World!"`

from users.

**Step 1.** Slice this string constant into \(m\) pieces:

1 | const raw = "Hello, World!", m = Math.max(Math.ceil(raw.length / 7), 2); |

Now, we have an array `sliced: string[]`

which contains sliced parts of the string constant:

1 | console.log(sliced); // [ 'Hello, ', 'World!' ] |

**Step 2.** Record the letter cases of all characters in `sliced: string[]`

. Also, in this step we will need to encode the lowercase of each string element's of `sliced`

using the `b64()`

helper function, and store it in `partial: string[]`

.

1 | const cases = [], partial = []; |

In this case, `rb64encodedText`

= `["r3oRi3BR19", "oHALi3tc"][i]`

.

Now, `cases`

becomes `[76, 112]`

(`[0b1001100, 0b1110000]`

, was once `[0b0001001100, 0b01110000]`

), and `partial`

becomes`["r3ori3br19", "ohali3tc"]`

. You may have noticed that the `0`

s before the first `1`

won't affect the result, so we'll use the `padLeft`

helper function to retrieve those missing `0`

s later when we're attempting to decode it.

**Step 3.** Replace all `+/=`

in `partial[i]`

to random characters ranged in `a-zA-Z0-9`

, and store relative information in `special: number[]`

. This step makes sure that `parseInt(partial[i], 36)`

can always return a valid integer (stored inside `magicNums: number[]`

).

According to Base64 - Wikipedia, equal signs only appear at the end of a base64 encoded string, so there's no need to deal with it because we have trimmed trailing equal signs in former steps.

Note that on line 4, we give `chunks: string[]`

an initial value `["0", "0"]`

. That's because `parseInt("")`

returns `NaN`

, and in most cases `base64Text`

won't include `"+"`

or `"/"`

, so `chunks`

will likely to remain unmodified. Hence, we need to make sure `parseInt(chunks[i])`

will not return `NaN`

.

1 | const special = []; |

Now, `special`

becomes `[0,0,0,0]`

since there's no `+`

or `/`

presents in base64-encoded text, and `magicNums`

becomes `[2752522766809245, 1918386016464]`

.

**Step 4.** Concat `magicNums.length`

, `magicNums`

, `special`

and `cases`

together, and store it as `dataArr`

. Then, convert `dataArr`

to `data`

.

1 | const dataArr = [magicNums.length] |

I've recognized that symbols like `+`

and `/`

rarely appears in the base64 encode method, so there'll always be a huge amount of `0`

s in `special: number[]`

. Thus, if we convert all the `0`

s in `dataArr`

to `""`

, we can save quite a lot of bytes.

**Conclusion for Generator Part**

Summary all codes together:

1 | const revb64templ = "9876543210zyxwvutsrqponmlkjihgfedcbaZYXWVUTSRQPONMLKJIHGFEDCBA+/="; |

And the input is encoded as:

1 | "8|16334679401c30|34d7ce9b829|||||51|77" |

## 0x02 Decoder

Assume that we have fetched the encoded data from our backend:

1 | { |

Our backend codes:

1

2

3

4

5

6

7

8 > // ...

>

> router.get("/api/encoded", ctx => {

> ctx.status = 200;

> ctx.type = ".json";

> ctx.body = JSON.stringify({encoded: encodeResult});

> });

>

Our frontend codes:

1

2

3

4

5

6

7 > fetch("/api/encoded").then(response => {

> if(!response.ok) return;

> const jsonData = response.json();

> const encoded = jsonData.encoded;

> // ...

> });

>

And now, we need to apply several operations to the `encoded`

variable in order to decode it.

First of all, we will convert `encoded: string`

to an array named `firstStepArr`

:

1 | const firstStepArr = encoded.split("|").map(str => parseInt(str || "0", 15)); |

And then, we'll try slice this array into 3 parts. Since the `encoded`

str contains three parts, `magicNums`

, `special`

and `cases`

, we can have a glance at previous codes:

1 | // previous code |

As we can see, let \(l=\texttt{sliced.length}\), then we have \(\texttt{partial.length}=l\) and \(\texttt{cases.length}=l\). Also, we can infer that \(\texttt{special.length}=2l\) and \(\texttt{magicNums.length}=l\). Thus, we can easily know that \(\texttt{firstStepArr.length}=4l+1\).

So, we can declare a `size: number`

variable as we will need to use it to slice the `firstStepArr`

and generate a `secondStepArr: number[][]`

later:

1 | const size = (firstStepArr.length - 1) / 4; |

In the third step, we try to convert numbers to decoded strings. The \(f(x)\) function mentioned above now shows its necessity:

1 | const thirdStepArr = [[], [], []]; |

Yep, we now have the raw text decoded as `thirdStepArr[0]`

!

**Conclusion for Decoder Part**

Final codes:

1 | const encoded = "8|16334679401c30|34d7ce9b829|||||51|77"; |

The decoded result:

1 | "Hello, World!" |

## 0x03 Use Without Decoder

Well, you can encode users' input and compare it with the already encoded text. However, you will have to modify `rndChar`

to achieve that, since in some rarely seen cases it will cause different encoded result for the same input text.

My suggestion is that you can use a controllable PRNG (Pseudo Random Number Generator) to replace `Math.random()`

, for example:

1 | const seed = 2997991993, modulo = 256; |

Don't forget to call `resetAll()`

to reset `exists`

to `[seed % modulo]`

and `index`

to `-1`

every time you want to encrypt a string!