AI & 2048: Learning the Evaluation Function with TensorFlow.js

20/11/2025 · 4 min read

In Part 1, we built a 2048 bot using expectimax plus a hand-crafted evaluation function. In Part 2, we let the bot self-tune that evaluation function using simple self-play hill-climbing. In this Part 3, we take the next step: we use TensorFlow.js to train a neural network that learns a value function V(board) directly in the browser, and we use that neural network to pick moves. Everything is done client-side with JavaScript and TensorFlow.js – no Python, no backend.

Table of Contents

1. Goal: learn a value function V(board)
2. Encoding the board as neural network input
3. Neural network architecture in TensorFlow.js
4. Where do training targets come from?
5. Using the neural network to pick moves
6. Live demo: 2048 + TensorFlow.js neural value function
7. Conclusion and next steps

1. Goal: learn a value function V(board)

We want a function:

V(board) ≈ “how good this state is”

Then, at each move:

Simulate all 4 directions: Up, Right, Down, Left.
For each direction, apply the move to get the resulting board, skipping directions that leave the board unchanged.
Use the neural network to predict V(board_next).
Pick the move with the highest predicted value.

In Parts 1–2, this value was computed by a linear heuristic.
Now, V is a non-linear function represented by a neural network trained from data.

2. Encoding the board as neural network input

The neural network works with numeric vectors, not 4×4 grids.
We encode the board into a vector of length 16:

Empty cell (0) → 0
Tile 2 → log2(2) = 1
Tile 4 → 2
Tile 8 → 3
… and so on

We also divide each value by 16, so tile 2 becomes 1/16 and tile 65536 becomes 1, which keeps the inputs in a small range:

function encodeBoard(board) {
  const arr = [];
  for (let r = 0; r < 4; r++) {
    for (let c = 0; c < 4; c++) {
      const v = board[r][c];
      if (v === 0) {
        arr.push(0);
      } else {
        arr.push(Math.log2(v) / 16); // simple scaling
      }
    }
  }
  return arr; // length = 16
}

This gives us a fixed-size numerical representation of the board that can be fed into a dense neural network.
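
As a quick sanity check, the sketch below restates the encoding in compact form and applies it to a sample board. The name `encodeBoardCompact` and the values in `sampleBoard` are made up for illustration; the mapping itself is meant to match `encodeBoard` above.

```javascript
// Compact restatement of the encoding above (hypothetical helper name).
function encodeBoardCompact(board) {
  // Flatten row by row; empty cells stay 0, tiles become log2(tile) / 16.
  return board.flat().map((v) => (v === 0 ? 0 : Math.log2(v) / 16));
}

// Illustrative board, not taken from a real game.
const sampleBoard = [
  [2, 4, 8, 16],
  [0, 0, 2, 0],
  [0, 0, 0, 0],
  [1024, 0, 0, 0],
];

const encoded = encodeBoardCompact(sampleBoard);
console.log(encoded.length); // 16
console.log(encoded.slice(0, 4)); // first row: tiles 2, 4, 8, 16 scaled by 1/16
```

Because the cells are flattened row by row, index `4 * r + c` of the vector always corresponds to cell `(r, c)`, so the network sees each board position at a fixed input.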

3. Neural network architecture in TensorFlow.js

We use a simple fully-connected network:

Input layer: 16 units (the encoded board)
Hidden layer 1: 64 units, ReLU
Hidden layer 2: 64 units, ReLU
Output layer: 1 unit (scalar value of the board)

In TensorFlow.js:

function createModel() {
  const model = tf.sequential();
  model.add(tf.layers.dense({
    inputShape: [16],
    units: 64,
    activation: 'relu',
  }));
  model.add(tf.layers.dense({
    units: 64,
    activation: 'relu',
  }));
  model.add(tf.layers.dense({
    units: 1, // scalar value V(board)
  }));
  model.compile({
    optimizer: tf.train.adam(0.001),
    loss: 'meanSquaredError',
  });
  return model;
}

This is deliberately small and simple, so it can train quickly in the browser.
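
To put a number on "small", the sketch below counts the trainable parameters of a dense network from its layer sizes. `countDenseParams` is a hypothetical helper written for this illustration, not a TensorFlow.js function; calling `model.summary()` on the real model should report the same total.

```javascript
// Count trainable parameters of a fully-connected network.
// Each dense layer has (inputs * units) weights plus one bias per unit.
function countDenseParams(layerSizes) {
  let total = 0;
  for (let i = 1; i < layerSizes.length; i++) {
    total += layerSizes[i - 1] * layerSizes[i] + layerSizes[i];
  }
  return total;
}

// 16 inputs -> 64 -> 64 -> 1 output, as in createModel().
const paramCount = countDenseParams([16, 64, 64, 1]);
console.log(paramCount); // 5313
```

The breakdown is 1088 + 4160 + 65 parameters for the three layers, which is tiny by neural network standards.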

4. Where do training targets come from?

A neural network needs target values to learn from. For this part, we do not use “pure RL” yet.
Instead, we let our hand-crafted heuristic from Parts 1–2 act as a teacher.

Training pipeline:

Generate many board states via random self-play (short rollouts).
For each board, compute a heuristic score: target = heuristicValue(board).
Train the neural network to predict that heuristic score from the encoded board.

This is supervised learning from a pseudo-expert.
Once the network can approximate the heuristic well, we can optionally keep training it, or later replace the targets with TD-learning updates.
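
The data-generation half of this pipeline could look roughly like the sketch below. Everything in it is an assumption made for illustration: `moveBoardSketch` and `spawnTile` are minimal stand-ins for the game logic, the direction numbering is arbitrary, and `heuristicValueSketch` is a toy placeholder rather than the actual heuristic from Parts 1–2.

```javascript
const SIZE = 4;

// Slide and merge one row to the left, 2048-style.
function slideRowLeft(row) {
  const tiles = row.filter((v) => v !== 0);
  const out = [];
  for (let i = 0; i < tiles.length; i++) {
    if (tiles[i] === tiles[i + 1]) {
      out.push(tiles[i] * 2);
      i++; // skip the tile that was just merged
    } else {
      out.push(tiles[i]);
    }
  }
  while (out.length < SIZE) out.push(0);
  return out;
}

const transpose = (b) => b[0].map((_, c) => b.map((row) => row[c]));
const mirror = (b) => b.map((row) => [...row].reverse());

// Assumed direction numbering: 0 = up, 1 = right, 2 = down, 3 = left.
function moveBoardSketch(board, direction) {
  let b = board;
  if (direction === 0 || direction === 2) b = transpose(b);
  if (direction === 1 || direction === 2) b = mirror(b);
  b = b.map(slideRowLeft);
  if (direction === 1 || direction === 2) b = mirror(b);
  if (direction === 0 || direction === 2) b = transpose(b);
  const moved = JSON.stringify(b) !== JSON.stringify(board);
  return { moved, board: b };
}

// Place a 2 (90%) or a 4 (10%) on a random empty cell.
function spawnTile(board, rng) {
  const empty = [];
  board.forEach((row, r) =>
    row.forEach((v, c) => {
      if (v === 0) empty.push([r, c]);
    })
  );
  if (empty.length === 0) return board;
  const [r, c] = empty[Math.floor(rng() * empty.length)];
  const next = board.map((row) => [...row]);
  next[r][c] = rng() < 0.9 ? 2 : 4;
  return next;
}

// Toy stand-in for the teacher heuristic: empty cells + log2(max tile).
function heuristicValueSketch(board) {
  const cells = board.flat();
  const empties = cells.filter((v) => v === 0).length;
  const maxTile = Math.max(...cells);
  return empties + (maxTile > 0 ? Math.log2(maxTile) : 0);
}

// Play short random rollouts and record (encoded board, target) pairs.
function generateSamples(numGames, maxSteps, rng = Math.random) {
  const inputs = [];
  const targets = [];
  for (let g = 0; g < numGames; g++) {
    let board = Array.from({ length: SIZE }, () => Array(SIZE).fill(0));
    board = spawnTile(spawnTile(board, rng), rng);
    for (let step = 0; step < maxSteps; step++) {
      const legal = [0, 1, 2, 3]
        .map((d) => moveBoardSketch(board, d))
        .filter((res) => res.moved);
      if (legal.length === 0) break; // no legal move left: game over
      board = spawnTile(legal[Math.floor(rng() * legal.length)].board, rng);
      inputs.push(board.flat().map((v) => (v === 0 ? 0 : Math.log2(v) / 16)));
      targets.push([heuristicValueSketch(board)]);
    }
  }
  return { inputs, targets };
}

const { inputs, targets } = generateSamples(5, 20);
console.log(inputs.length, targets.length); // equal counts of boards and targets
```

The resulting arrays could then be wrapped with `tf.tensor2d(inputs, [inputs.length, 16])` and `tf.tensor2d(targets, [targets.length, 1])` and passed to `model.fit`. If the real heuristic produces large scores, scaling the targets down before training may help keep the mean squared error in a manageable range.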

5. Using the neural network to pick moves

Given a model and a board, the NN-based evaluation is:

function evaluateBoardNN(board, model) {
  return tf.tidy(() => {
    const input = tf.tensor2d([encodeBoard(board)], [1, 16]);
    const output = model.predict(input);
    const value = output.dataSync()[0];
    return value;
  });
}

To choose a move:

function chooseBestMoveWithNN(board, model) {
  const moves = [UP, RIGHT, DOWN, LEFT];
  let bestMove = null;
  let bestValue = -Infinity;

  for (const move of moves) {
    const result = moveBoard(board, move);
    if (!result.moved) continue;
    const value = evaluateBoardNN(result.board, model);
    if (value > bestValue) {
      bestValue = value;
      bestMove = move;
    }
  }

  return bestMove;
}

If the model is not trained yet, we can fall back to the classic heuristic so the bot still plays something reasonable.
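
One possible way to wire up that fallback is sketched below. The names `makeEvaluator`, `getBatchesTrained`, and `minBatches` are hypothetical, and the two evaluation functions are passed in so the sketch does not depend on how the demo stores its model or heuristic.

```javascript
// Returns an evaluation function that uses the network once it has been
// trained for at least `minBatches` batches, and the heuristic otherwise.
function makeEvaluator({ nnEvaluate, heuristicEvaluate, getBatchesTrained, minBatches = 1 }) {
  return function evaluate(board) {
    return getBatchesTrained() >= minBatches
      ? nnEvaluate(board)
      : heuristicEvaluate(board);
  };
}

// Usage with stub evaluators (placeholders for the real ones).
let batchesTrained = 0;
const evaluate = makeEvaluator({
  nnEvaluate: () => 0.5, // stands in for evaluateBoardNN(board, model)
  heuristicEvaluate: (board) => board.flat().filter((v) => v === 0).length,
  getBatchesTrained: () => batchesTrained,
});

const emptyBoard = Array.from({ length: 4 }, () => Array(4).fill(0));
console.log(evaluate(emptyBoard)); // 16 (heuristic path: 16 empty cells)
batchesTrained = 1;
console.log(evaluate(emptyBoard)); // 0.5 (network path)
```

Reading the batch count through a callback means the same evaluator switches over automatically once training has run, without the move-selection code needing to know which evaluator is active.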

6. Live demo: 2048 + TensorFlow.js neural value function

The demo below includes:

A 4×4 2048 board rendered in HTML.
Buttons to:
- Auto-play with NN – the bot uses the neural network to choose moves.
- Train 1 batch (NN) – generate training data and update the network once.
- Train 10 batches (NN) – same as above, repeated 10 times.
- Reset model – discard the current neural network and create a new one.
- Reset board – start a fresh 2048 game.
Live info:
- Whether the model is initialized.
- How many training batches have been run.
- The loss of the last training batch.

All training happens in the browser using TensorFlow.js loaded from a CDN.

[Interactive demo: 2048 Neural Value Function Demo (TensorFlow.js), in which a neural network learns V(board) from a heuristic and then uses it to select moves. Panels: score, max tile, status, and NN training info.]

7. Conclusion and next steps

In this Part 3, we:

Encoded the 2048 board as a fixed-size numeric vector.
Built a neural network in TensorFlow.js to approximate a value function V(board).
Used the hand-crafted heuristic as a teacher to generate supervised training data.
Used the trained neural network to drive a 2048 bot directly in the browser.

From here, there are several natural upgrades:

Replace heuristic targets with TD-learning or Q-learning style updates.
Combine expectimax with the neural value function (NN as a learned evaluator instead of a manual heuristic).
Add replay buffers, learning rate schedules, and better exploration strategies.

But even with just these three parts, the “AI & 2048” series has gone from human heuristics → expectimax → self-play tuning → neural value functions in TensorFlow.js, which is enough to build a surprisingly smart 2048 bot entirely inside the browser.

Series: AI and the 2048 Game

Previous post:
Self-learning 2048 bot: tuning the evaluation function with self-play