Facing an error performing merge sort on a large array

Hi, I am performing merge sort in a Node.js/Express app that is running on Ubuntu Server 24. I am sending requests at a 10-second interval; that is, after receiving a response from the Express app, I wait 10 seconds before sending a new request. The length of the array ranges from 500,000 to 10,000,000 in steps of 50,000.
However, around length 8,000,000 I am getting this error:

[608.959660] page table: failed to allocate memory

frame trace:
ffffc0000280fdf8:   ffffffffca13e054    (new_zeroed_pages + 0000000000000194/00000000000001ef)
ffffc0000280fe88:   ffffffffca13e9c8    (do_demand_page + 00000000000001b8/000000000000054a)
ffffc0000280fef8:   ffffffffca16f52f    (unix_fault_handler + 000000000000081f/0000000000000eec)
ffffc0000280ff98:   ffffffffca1a37fa    (common_handler + 00000000000002fa/000000000000053e)
ffffc0000280ffe8:   ffffffffca07d0eb
00000000ffe69068:   000000000126871f
00000000ffe690b8:   00000000012688a7
00000000ffe690f8:   00000000014e2166
00000000ffe69138:   000000000192def6
00000000ffe69160:   000000013858e436
00000000ffe691e8:   0000000138597df1
00000000ffe692c0:   0000000138599550
00000000ffe69358:   0000000138594a34
00000000ffe693e0:   000000013859b0f7
00000000ffe69470:   0000000138599374
00000000ffe694d0:   0000000138594a34
00000000ffe69558:   0000000138597349
00000000ffe695d8:   000000013859482d
00000000ffe69630:   0000000138595a26
00000000ffe696f0:   00000001385976e8
00000000ffe69758:   0000000138594a34
00000000ffe697e0:   000000013859bae8
00000000ffe69870:   00000001385973c1
00000000ffe698f8:   000000013859482d
00000000ffe69950:   0000000138595a26
00000000ffe69a10:   0000000138597842
00000000ffe69a80:   0000000138594a34
00000000ffe69b08:   000000013859bae8
00000000ffe69b98:   00000001385973c1
00000000ffe69c20:   000000013859482d
00000000ffe69c78:   0000000138595a26
00000000ffe69d38:   00000001385971b2

kernel load offset ffffffffc9e7c000

What could be the problem, and is there any way to resolve it?

Do you have a code snippet that reproduces this? Have you tried bumping the memory?

Yes, I do:

const mergeSort = (arr) => {
  if (arr.length <= 1) return arr;
  const middle = Math.floor(arr.length / 2);
  const left = arr.slice(0, middle);
  const right = arr.slice(middle);
  return merge(mergeSort(left), mergeSort(right));
};

const merge = (left, right) => {
  let resultArray = [], leftIndex = 0, rightIndex = 0;

  while (leftIndex < left.length && rightIndex < right.length) {
    if (left[leftIndex] < right[rightIndex]) {
      resultArray.push(left[leftIndex]);
      leftIndex++;
    } else {
      resultArray.push(right[rightIndex]);
      rightIndex++;
    }
  }
  return resultArray
    .concat(left.slice(leftIndex))
    .concat(right.slice(rightIndex));
};
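(Side note: every call above allocates two slices plus a merged copy, so the recursion churns through a lot of temporary arrays. If the data is purely numeric, a bottom-up variant over a `Float64Array` with a single reusable scratch buffer avoids nearly all of that; a sketch, where `mergeSortTyped` is just an illustrative name:)

```javascript
// Sketch: bottom-up merge sort on a Float64Array (assumes numeric input,
// since typed arrays only hold numbers). One scratch buffer is allocated
// up front and the two buffers swap roles each pass, so no per-merge
// allocations happen -- unlike the recursive slice/concat version above.
const mergeSortTyped = (input) => {
  let src = Float64Array.from(input);
  let dst = new Float64Array(src.length); // single reusable scratch buffer

  for (let width = 1; width < src.length; width *= 2) {
    for (let lo = 0; lo < src.length; lo += 2 * width) {
      const mid = Math.min(lo + width, src.length);
      const hi = Math.min(lo + 2 * width, src.length);
      let i = lo, j = mid, k = lo;
      // merge src[lo..mid) and src[mid..hi) into dst[lo..hi)
      while (i < mid && j < hi) dst[k++] = src[i] <= src[j] ? src[i++] : src[j++];
      while (i < mid) dst[k++] = src[i++];
      while (j < hi) dst[k++] = src[j++];
    }
    [src, dst] = [dst, src]; // swap buffers instead of allocating
  }
  return src;
};
```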
{
  "Args": ["app.js"],
  "Files": [
    "app.js",
    "package.json",
    "./data/randomData.js",
    "./data/array_sample_1.json",
    "./data/array_sample_2.json",
    "./benchmarks/bubbleSort.js",
    "./benchmarks/factorial.js",
    "./benchmarks/fibonacci.js",
    "./benchmarks/hashing.js",
    "./benchmarks/semiPrime.js",
    "./benchmarks/mergeSort.js"
  ],
  "Dirs": ["node_modules", "data", "benchmarks"],
  "RunConfig": {
    "CPUs": 1,
    "Memory": "2G",
    "Accel": true
  }
}

This is the configuration file; I am running on QEMU.

And this is the logic of the code that I am running.

So I built on your example here with the following:

function generateLargeArray(size, start = 0, step = 1) {
  return Array.from({ length: size }, (_, index) => start + index * step);
}

function shuffle(array) {
  // Fisher-Yates shuffle
  let currentIndex = array.length;

  while (currentIndex !== 0) {
    const randomIndex = Math.floor(Math.random() * currentIndex);
    currentIndex--;

    [array[currentIndex], array[randomIndex]] = [
      array[randomIndex], array[currentIndex]];
  }
}

const mergeSort = (arr) => {
  if (arr.length <= 1) return arr;
  const middle = Math.floor(arr.length / 2);
  const left = arr.slice(0, middle);
  const right = arr.slice(middle);
  return merge(mergeSort(left), mergeSort(right));
};

const merge = (left, right) => {
  let resultArray = [], leftIndex = 0, rightIndex = 0;

  while (leftIndex < left.length && rightIndex < right.length) {
    if (left[leftIndex] < right[rightIndex]) {
      resultArray.push(left[leftIndex]);
      leftIndex++;
    } else {
      resultArray.push(right[rightIndex]);
      rightIndex++;
    }
  }
  return resultArray
    .concat(left.slice(leftIndex))
    .concat(right.slice(rightIndex));
};

const ray = generateLargeArray(8000000, 100, 50000);

shuffle(ray);
console.log(ray[0]);
console.log(ray[1]);
console.log("sorting..");

const nbob = mergeSort(ray); // const, so it isn't an implicit global

console.log(nbob.length);
console.log(nbob[0]);
console.log(nbob[1]);

I presume it’s something like this? I wasn’t able to reproduce what you’re seeing with this, but you also mentioned that you are interacting with a webserver, which suggests the process is long-lived. If you’re repeatedly allocating large arrays like this and never releasing them, you could be running out of memory quickly.

I’d ensure that, however you’re allocating these, they are scoped in a way that lets the GC reclaim them. You could drill down into this by profiling your memory.
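For a quick first pass before reaching for a full profiler, something like this hypothetical `withHeapLog` helper around Node’s built-in `process.memoryUsage()` will show the heap growth per call:

```javascript
// Sketch: log heap usage around a unit of work using process.memoryUsage().
// `withHeapLog` is just an illustrative name, not part of any library.
const mb = (bytes) => `${(bytes / 1024 / 1024).toFixed(1)} MB`;

const withHeapLog = (label, fn) => {
  const before = process.memoryUsage().heapUsed;
  const result = fn();
  const after = process.memoryUsage().heapUsed;
  console.log(`${label}: heapUsed ${mb(before)} -> ${mb(after)}`);
  return result;
};

// e.g. const sorted = withHeapLog('mergeSort', () => mergeSort(bigArray));
```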

I actually just looked, and even calling this once makes memory jump from 40 MB to ~1 GB (on Linux).
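That jump is roughly what the slicing predicts. A back-of-envelope count (assuming 8-byte numbers and ignoring V8’s per-array overhead, which only makes the real figures larger):

```javascript
// Rough arithmetic for a top-down merge sort of 8,000,000 numbers.
const n = 8_000_000;
const levels = Math.ceil(Math.log2(n));   // ~23 recursion levels
const rawBytes = n * 8;                   // ~61 MiB for the raw input alone
// Each level slices the whole array (left + right) and builds merged copies,
// so roughly 2n elements are allocated per level across the recursion:
const churnedBytes = 2 * n * 8 * levels;  // ~2.7 GiB allocated in total
console.log(levels,
            (rawBytes / 2 ** 20).toFixed(0) + ' MiB',
            (churnedBytes / 2 ** 30).toFixed(1) + ' GiB');
```

Most of that is garbage the GC can reclaim between merges, but the allocation pressure alone is enough to balloon the resident set well past the raw array size.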

Okay, yeah, that makes sense. I am using it for a benchmark, so it makes sense that it is going to kill the server. What puzzles me is that I am executing the same algorithm in Docker as well, and around this range it worked with 512 MB of memory. I am not sure the ranges are directly comparable, but I am sure the configured memory and the Docker memory were the same, and the unikernel died before Docker did.

Just a quick question, since I haven’t had to worry about memory usage in a long time (the last time was in 2016, learning C): if I force GC after each run, that should theoretically work, right?

Well, I was seeing 1 GB with the code I pasted after just one allocation/shuffle/sort under Linux.

You can force a GC run, but it isn’t going to pick up anything it can’t reclaim. If you are creating a global variable, for instance, the GC typically won’t reclaim it. You’d want these as local variables where possible.
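If you do want to experiment with forcing it: `global.gc()` is only defined when Node is started with the `--expose-gc` flag (`node --expose-gc app.js`). A sketch, where the hypothetical `workload` callback stands in for your allocate/shuffle/sort step and keeps the big array local:

```javascript
// Sketch: run each benchmark size inside a callback so the array stays
// local to it, then request a collection between runs. global.gc only
// exists under `node --expose-gc`; the typeof guard makes this a no-op
// when the flag is absent.
const runWithGc = (sizes, workload) => {
  for (const size of sizes) {
    workload(size);  // big array is scoped to the callback, so it can be reclaimed
    if (typeof global.gc === 'function') global.gc();
  }
};

// e.g. runWithGc([500_000, 1_000_000], (n) => mergeSort(makeShuffledArray(n)));
// (makeShuffledArray is a placeholder for your array-generation step)
```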

The array is scoped inside the execution function, in a separate module, so I am invoking the function when the server receives a request.
I will try a few more runs and write back if anything changes.

I ran two rounds of the benchmark, with GC invoked.
I changed the array length range to 500,000 - 11,500,000, but the interval is now geometric. I allocated 2G of memory in the config file, and so far it’s working perfectly!

Thank you so much for your help!
