Formatting raw values consistently in JavaScript using a recursive function

December 26, 2023

Incoming data often aren't formatted consistently, or in the way our app or database expects: integers may be quoted and so treated as strings, or a date string may need to be converted to Unix epoch time. I've developed a couple of reasonably portable utility functions for JavaScript that process raw values and return data in a consistent, expected form. And because data may be structured or nested, this approach allows for recursive processing.

Adding and importing the needed packages

I say "reasonably" because these functions do rely on a few specific imports from third-party libraries: lodash, just, and dayjs.

First, we need just one lodash function: uniq. To keep things lightweight, I only want to import that single function, not the entire lodash library. Here I install the standalone uniqBy package instead and call it without passing an "iteratee" as an argument, in which case it falls back to comparing the values themselves and so behaves like uniq:

npm install lodash.uniqby

Next, we need the just-deep-map-values function from the just library. There are a lot of other similar tools, but this one is very simple and we only need it for one part of the code. Plus we can just (oh fun with puns!) import the specific function:

npm install just-deep-map-values

Last, I like to use the dayjs library for handling dates and times. It's lightweight, too, and lets you specifically import/extend with plugins (in this case, we'll need the utc and the customParseFormat plugins):

npm install dayjs

With package.json updated with the tools we need, create or use some utilities file (e.g., helpers.ts) and import the packages. We also want to import the dayjs plugins and apply them.

helpers.ts
import _uniqBy from "lodash.uniqby";
import deepMapValues from "just-deep-map-values";
import dayjs from "dayjs";
import utc from "dayjs/plugin/utc";
import customParseFormat from "dayjs/plugin/customParseFormat.js";

dayjs.extend(utc);
dayjs.extend(customParseFormat);

Structure the conditional logic of the processor function

Start by assigning the raw input value to the return value, and then creating a set of if/else-if statements that may alter the return value before it's returned. The order, here, does matter—especially once we get into the non-null/-empty/-undefined portions—still, you might want to treat things a little differently than I have. For example, maybe an empty string should still be treated as a string (though in the code below I'm treating it as null). Or, maybe the string "true" should not be converted to a boolean. However, you should check if a value is an integer before checking if it's a number, and both before checking if it's a string to be sure values are caught and processed properly.

helpers.ts
...

export const processRawValue = (rawValue: any) => {
  let returnValue = rawValue;

  if (rawValue === null || rawValue === "") {
    returnValue = null;
  } else if (rawValue === undefined) {
    returnValue = undefined;
  } else if (rawValue === "true" || rawValue === true) {
    returnValue = 1;
  } else if (rawValue === "false" || rawValue === false) {
    returnValue = 0;
  } else if (Number.isInteger(Number(rawValue))) {
    returnValue = parseInt(rawValue, 10);
  } else if (!isNaN(Number(rawValue))) {
    returnValue = Number(rawValue);
  } else if (typeof rawValue === "string") {
    returnValue = rawValue.toString().trim();
  }

  return returnValue;
};
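
To see why the order of the numeric branches (and the choice of conversion inside each) matters, here is a small sketch using only built-ins. The describeNumeric helper and the sample inputs are hypothetical and not part of helpers.ts; they just report how Number(), parseInt(), and Number.isInteger() treat strings that look numeric.

```typescript
// Hypothetical helper (not part of helpers.ts): report how the built-ins
// used in the numeric branches treat a raw string.
export const describeNumeric = (raw: string) => ({
  raw,
  asNumber: Number(raw), // converts the whole string
  asParseInt: parseInt(raw, 10), // reads a leading base-10 integer only
  looksInteger: Number.isInteger(Number(raw)),
});

// Illustrative inputs, chosen to include a few awkward cases
const samples = ["42", "3.14", "1e3", "0x10", "   ", "abc"];

for (const sample of samples) {
  console.log(describeNumeric(sample));
}
```

Strings such as "1e3" and "0x10" pass the integer check, but parseInt() reads only their leading digits, and a whitespace-only string converts to 0 under Number() but to NaN under parseInt(). If inputs like these can show up in your data, it may be worth trimming first or using Number() in the integer branch as well.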

Process objects

What the code above leaves out, however, are cases where the raw input value should be treated as an object or a date. If it's an object, what kind of object is it? An array or a set of key/value pairs? Either way, we need to process it recursively. For arrays, this just means applying the processor to each value and then removing any duplicates. But for objects, there's no easy/native way to traverse the nested structure and apply the processor function, so we turn to the deepMapValues() function that we imported previously. So, assuming rawValue is an object:

if (rawValue instanceof Array) {
  returnValue = rawValue.map((value) => processRawValue(value));
  returnValue = _uniqBy(returnValue);
} else {
  returnValue = deepMapValues(rawValue, processRawValue);
}
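
For readers who want to see the recursion spelled out, below is a dependency-free sketch of the same idea. It is a stand-in, not the implementation of deepMapValues or uniqBy: the mapLeaves and toNumberIfNumeric names are hypothetical, a Set is used for de-duplication, and this version de-duplicates arrays at every depth, which may differ from what the libraries do.

```typescript
type Leaf = string | number | boolean | null | undefined;
type Nested = Leaf | Nested[] | { [key: string]: Nested };

// Hypothetical stand-in: apply fn to every non-object value, however deeply nested.
export const mapLeaves = (input: Nested, fn: (leaf: Leaf) => Leaf): Nested => {
  if (Array.isArray(input)) {
    // Process each element, then drop duplicates. A Set compares primitives
    // by value and objects by reference, so two equal-looking objects both survive.
    return Array.from(new Set(input.map((item) => mapLeaves(item, fn))));
  }
  if (input !== null && typeof input === "object") {
    const result: { [key: string]: Nested } = {};
    for (const key of Object.keys(input)) {
      result[key] = mapLeaves(input[key], fn);
    }
    return result;
  }
  return fn(input);
};

// Hypothetical leaf processor: convert numeric-looking strings to numbers.
export const toNumberIfNumeric = (leaf: Leaf): Leaf =>
  typeof leaf === "string" && leaf.trim() !== "" && !Number.isNaN(Number(leaf))
    ? Number(leaf)
    : leaf;

const processed = mapLeaves(
  { ids: ["1", 1, "2"], meta: { score: "3.5", tags: ["a", "a"] } },
  toNumberIfNumeric
);
console.log(JSON.stringify(processed));
// → {"ids":[1,2],"meta":{"score":3.5,"tags":["a"]}}
```

Note that "1" and 1 collapse into a single entry only because the values are processed before duplicates are removed, which is the same order used in the snippet above.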

Process dates

Then, in terms of whether a raw input value should be treated as a date, we just need to amend the string portion of the conditional logic to format as a date if appropriate to do so. For this, though, we'll need to define a formatIfDate() function in our utilities file that takes in a string in one of any number of formats we might expect and then returns a consistently formatted string or value. Note that we need to specify the list of expected rawFormats because we don't want to accidentally parse a non-date string as a date just because the dayjs library is capable of doing so:

helpers.ts
...

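export const formatIfDate = (timestamp: any) => {
  if (timestamp === undefined) {
    return null;
  }

  // Only these formats are treated as dates; anything else is returned untouched
  const rawFormats = ["MMMM D, YYYY", "MMMM D, YYYY - h:mma", "YYYY-MM-DD", "YYYY-MM-DD HH:mm:ss"];
  const returnFormat = "YYYY-MM-DD HH:mm:ss";

  // Parse once in strict mode and reuse the result, so the value is formatted
  // from the same parse that validated it
  const parsed = dayjs(timestamp, rawFormats, true);
  if (parsed.isValid()) {
    return parsed.format(returnFormat);
  }

  return timestamp;
};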
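
The allow-list idea can also be illustrated without any library. The sketch below is not how dayjs parses; it swaps in regular expressions for two of the expected formats and validates the calendar date with a UTC round trip. The normalizeIfKnownDate, DATE_ONLY, and DATE_TIME names are hypothetical.

```typescript
// Hypothetical allow-list covering two of the expected formats
const DATE_ONLY = /^(\d{4})-(\d{2})-(\d{2})$/;
const DATE_TIME = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/;

// Return "YYYY-MM-DD HH:mm:ss" for strings in a known format that name a
// real calendar date; return every other string unchanged.
export const normalizeIfKnownDate = (input: string): string => {
  const match = DATE_TIME.exec(input) ?? DATE_ONLY.exec(input);
  if (!match) return input;

  // Time parts are missing for date-only input, so default them to midnight
  const [, y, mo, d, h = "00", mi = "00", s = "00"] = match;
  const normalized = `${y}-${mo}-${d} ${h}:${mi}:${s}`;

  // Date.UTC rolls impossible values over (Feb 30 becomes Mar 2), so a
  // mismatch after the round trip means the input was not a real date
  const date = new Date(Date.UTC(+y, +mo - 1, +d, +h, +mi, +s));
  const roundTrip = date.toISOString().slice(0, 19).replace("T", " ");

  return roundTrip === normalized ? normalized : input;
};

console.log(normalizeIfKnownDate("2023-12-26")); // → 2023-12-26 00:00:00
console.log(normalizeIfKnownDate("2023-02-30")); // → 2023-02-30 (left as-is)
console.log(normalizeIfKnownDate("Order 12")); // → Order 12 (left as-is)
```

The point is the same as with the strict dayjs call: a string is only rewritten when it matches a format we explicitly expect, so ordinary text is never reinterpreted as a date.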

In my case, I also wanted to look out for ISO-formatted dates, which may include a timezone offset. This is the most reliable/unambiguous format, and what I (and I would assume most people) prefer when passing around date strings between components, APIs, etc. To do so, we create another function that can check for the format and adjust the value if a timezone is set. The strategy here is to first extract the date/time portion of the string separate from the timezone offset, and then to create a new date object and check whether the ISO-formatted string version of that new date object matches the original value:

helpers.ts
...

export const isISOString = (value: any) => {
  if (typeof value === "string") {
    let timezoneOffset, dateString;

    // If timezone is Z format, strip the "Z" and append an explicit UTC offset
    // so the string parses the same way in any local timezone
    if (value.endsWith("Z")) {
      dateString = value.substring(0, value.length - 1);
      timezoneOffset = "+00:00";
    }

    // If timezone is -/+hh:mm format, treat the date/time portion as UTC
    // so it can be compared against the ISO output below
    if (value.slice(-6).startsWith("-") || value.slice(-6).startsWith("+")) {
      dateString = value.substring(0, value.length - 6);
      timezoneOffset = "+00:00";
    }

    const date = dayjs(dateString + timezoneOffset);

    if (dayjs(date).isValid()) {
      const dateIsoFormat = date.toISOString();

      let formatMatches = date.toISOString() === value;
      if (!formatMatches) {
        const dateIsoFormatComparisonString = dateIsoFormat.substring(0, dateIsoFormat.length - 5);
        const valueFormatComparisonString = value.substring(0, value.length - 6);
        formatMatches = dateIsoFormatComparisonString === valueFormatComparisonString;
      }

      return formatMatches;
    }
  }
  return false;
};
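
The round-trip strategy can be sketched with the built-in Date object alone. This is a simplified illustration under my own assumptions, not a replacement for the dayjs-based function: the looksLikeIsoString and OFFSET_SUFFIX names are hypothetical, and the sketch accepts strings with or without milliseconds.

```typescript
// Hypothetical pattern for a trailing -/+hh:mm offset
const OFFSET_SUFFIX = /[+-]\d{2}:\d{2}$/;

export const looksLikeIsoString = (value: unknown): boolean => {
  if (typeof value !== "string") return false;

  // Step 1: separate the date/time portion from the timezone designator
  let wallClock: string;
  if (value.endsWith("Z")) {
    wallClock = value.slice(0, -1);
  } else if (OFFSET_SUFFIX.test(value)) {
    wallClock = value.slice(0, -6);
  } else {
    return false;
  }

  // Step 2: build a Date from the date/time portion, read as UTC
  const date = new Date(`${wallClock}Z`);
  if (Number.isNaN(date.getTime())) return false;

  // Step 3: toISOString() always yields YYYY-MM-DDTHH:mm:ss.sssZ, so the
  // input is ISO-formatted if it matches with or without the milliseconds
  const isoWallClock = date.toISOString().slice(0, -1);
  return isoWallClock === wallClock || isoWallClock.slice(0, -4) === wallClock;
};

console.log(looksLikeIsoString("2023-12-26T10:00:00.000Z")); // → true
console.log(looksLikeIsoString("2023-12-26T10:00:00-05:00")); // → true
console.log(looksLikeIsoString("2023-12-26")); // → false
```

Reading the date/time portion as UTC is what makes the comparison work: the ISO output then carries the same wall-clock digits as the input, regardless of the machine's local timezone.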

Update processor function to handle objects and dates

With these pieces now in place, we can update (and rename) the original processor function to handle objects and dates. I also opted to allow for a debug parameter to be passed in so I can check how my processor function is performing. The final code looks like this:

helpers.ts
import _uniqBy from "lodash.uniqby";
import deepMapValues from "just-deep-map-values";
import dayjs from "dayjs";
import utc from "dayjs/plugin/utc";
import customParseFormat from "dayjs/plugin/customParseFormat.js";

dayjs.extend(utc);
dayjs.extend(customParseFormat);

export const processRawValueRecursively = (rawValue: any, debug = false) => {
  if (debug) console.log({ rawValue });

  let returnValue = rawValue;

  if (rawValue === null || rawValue === "") {
    if (debug) console.log({ type: "null" });
    returnValue = null;
  } else if (rawValue === undefined) {
    if (debug) console.log({ type: "undefined" });
    returnValue = undefined;
  } else if (typeof rawValue === "object") {
    if (rawValue instanceof Array) {
      if (debug) console.log({ type: "array" });
      // Pass debug along explicitly so nested values log consistently
      returnValue = rawValue.map((value) => processRawValueRecursively(value, debug));
      returnValue = _uniqBy(returnValue);
    } else {
      if (debug) console.log({ type: "object" });
      // Wrap the callback so that any extra arguments supplied by
      // deepMapValues are not mistaken for the debug flag
      returnValue = deepMapValues(rawValue, (value: any) => processRawValueRecursively(value, debug));
    }
  } else if (rawValue === "true" || rawValue === true) {
    if (debug) console.log({ type: "boolean (true)" });
    returnValue = 1;
  } else if (rawValue === "false" || rawValue === false) {
    if (debug) console.log({ type: "boolean (false)" });
    returnValue = 0;
  } else if (Number.isInteger(Number(rawValue))) {
    if (debug) console.log({ type: "integer" });
    returnValue = parseInt(rawValue, 10);
  } else if (!isNaN(Number(rawValue))) {
    if (debug) console.log({ type: "number" });
    returnValue = Number(rawValue);
  } else if (typeof rawValue === "string") {
    if (debug) console.log({ type: "string/date" });
    returnValue = formatIfDate(rawValue.toString().trim());
  } else if (debug) {
    console.log({ type: "unknown" });
  }

  if (debug) console.log({ returnValue });

  return returnValue;
};

export const isISOString = (value: any) => {
  if (typeof value === "string") {
    let timezoneOffset, dateString;

    // If timezone is Z format, strip the "Z" and append an explicit UTC offset
    // so the string parses the same way in any local timezone
    if (value.endsWith("Z")) {
      dateString = value.substring(0, value.length - 1);
      timezoneOffset = "+00:00";
    }

    // If timezone is -/+hh:mm format, treat the date/time portion as UTC
    // so it can be compared against the ISO output below
    if (value.slice(-6).startsWith("-") || value.slice(-6).startsWith("+")) {
      dateString = value.substring(0, value.length - 6);
      timezoneOffset = "+00:00";
    }

    const date = dayjs(dateString + timezoneOffset);

    if (dayjs(date).isValid()) {
      const dateIsoFormat = date.toISOString();

      let formatMatches = date.toISOString() === value;
      if (!formatMatches) {
        const dateIsoFormatComparisonString = dateIsoFormat.substring(0, dateIsoFormat.length - 5);
        const valueFormatComparisonString = value.substring(0, value.length - 6);
        formatMatches = dateIsoFormatComparisonString === valueFormatComparisonString;
      }

      return formatMatches;
    }
  }
  return false;
};

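export const formatIfDate = (timestamp: any) => {
  if (timestamp === undefined) {
    return null;
  }

  const returnFormatString = "YYYY-MM-DD HH:mm:ss";
  const possibleDateFormats = ["MMMM D, YYYY", "MMMM D, YYYY - h:mma", "YYYY-MM-DD", "YYYY-MM-DD HH:mm:ss"];

  // Parse once in strict mode and reuse the result, so the value is formatted
  // from the same parse that validated it
  const parsed = dayjs(timestamp, possibleDateFormats, true);
  if (parsed.isValid()) {
    return parsed.format(returnFormatString);
  }

  // ISO strings are not in the list above; dayjs's default parser handles them
  if (isISOString(timestamp)) {
    return dayjs(timestamp).format(returnFormatString);
  }

  return timestamp;
};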