Often our incoming data are not formatted consistently, or not in the way our app or database expects: integers may be quoted and so assumed to be strings, or a date string may need to be in Unix epoch time. I've developed a couple of reasonably portable utility functions for JavaScript that process raw values and return data consistently and as expected. And because data may be structured or nested, this approach allows for recursive processing.
Adding and importing the needed packages
I say "reasonably" because these functions do rely on a few specific imports from third-party libraries: lodash, just, and dayjs.
First, we need just one lodash function: uniq. To keep things lightweight, I only want to import that single function, not the entire lodash library. Here I install the per-method uniqBy package, which behaves like uniq when we don't pass an "iteratee" as an argument:
npm install lodash.uniqby
Next, we need the just-deep-map-values function from the just library. There are a lot of other similar tools, but this one is very simple and we only need it for one part of the code. Plus we can just (oh fun with puns!) import the specific function:
npm install just-deep-map-values
Last, I like to use the dayjs library for handling dates and times. It's lightweight, too, and lets you specifically import/extend with plugins (in this case, we'll need the utc and the customParseFormat plugins):
npm install dayjs
With package.json updated with the tools we need, create or use some utilities file (e.g., helpers.ts) and import the packages. We also want to import the dayjs plugins and apply them.
import _uniqBy from "lodash.uniqby";
import deepMapValues from "just-deep-map-values";
import dayjs from "dayjs";
import utc from "dayjs/plugin/utc";
import customParseFormat from "dayjs/plugin/customParseFormat.js";
dayjs.extend(utc);
dayjs.extend(customParseFormat);
Structure the conditional logic of the processor function
Start by assigning the raw input value to the return value, then create a chain of if/else-if statements that may alter the return value before it's returned. The order matters here, especially once we get past the null, empty, and undefined checks, though you might want to treat some cases differently than I have. For example, maybe an empty string should still be treated as a string (in the code below I treat it as null). Or maybe the string "true" should not be converted at all (in the code below, "true" and true both become 1, and "false" and false both become 0). Either way, check whether a value is an integer before checking whether it's a number, and do both before checking whether it's a string, so that values are caught and processed properly.
...
export const processRawValue = (rawValue: any) => {
  let returnValue = rawValue;
  if (rawValue === null || rawValue === "") {
    returnValue = null;
  } else if (rawValue === undefined) {
    returnValue = undefined;
  } else if (rawValue === "true" || rawValue === true) {
    returnValue = 1;
  } else if (rawValue === "false" || rawValue === false) {
    returnValue = 0;
  } else if (Number.isInteger(Number(rawValue))) {
    // Use Number() rather than parseInt(): parseInt("1e3", 10) returns 1, not 1000
    returnValue = Number(rawValue);
  } else if (!isNaN(Number(rawValue))) {
    returnValue = Number(rawValue);
  } else if (typeof rawValue === "string") {
    // Any string that reaches this point is non-numeric; just trim it
    returnValue = rawValue.trim();
  }
  return returnValue;
};
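To see why the order of the checks matters, here is a small dependency-free sketch. The labelRawValue helper and the ValueLabel type are hypothetical names used only for illustration; the helper applies the same sequence of checks as the processor but returns a label instead of a converted value.

```typescript
// Hypothetical helper for illustration only: it labels a value using the same
// order of checks as the processor, without transforming the value.
type ValueLabel = "null" | "undefined" | "boolean" | "integer" | "number" | "string" | "other";

const labelRawValue = (rawValue: unknown): ValueLabel => {
  if (rawValue === null || rawValue === "") return "null";
  if (rawValue === undefined) return "undefined";
  if (rawValue === "true" || rawValue === true) return "boolean";
  if (rawValue === "false" || rawValue === false) return "boolean";
  if (Number.isInteger(Number(rawValue))) return "integer";
  if (!Number.isNaN(Number(rawValue))) return "number";
  if (typeof rawValue === "string") return "string";
  return "other";
};

// Number() coerces all of these to valid numbers, which is why the null,
// empty-string, and boolean checks have to run before the numeric ones.
console.log(Number(""), Number(null), Number(true)); // 0 0 1

console.log(labelRawValue("42"));    // "integer"
console.log(labelRawValue("3.14"));  // "number"
console.log(labelRawValue("hello")); // "string"
console.log(labelRawValue(""));      // "null"
console.log(labelRawValue(true));    // "boolean"

// An empty array also coerces to 0, so without an earlier object check it
// would be mistaken for an integer.
console.log(labelRawValue([]));      // "integer"
```

The last line hints at why objects and arrays need their own branch ahead of the numeric checks.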
Process objects
What the code above leaves out, however, are cases where the raw input value should be treated as an object or a date. If it's an object, what kind of object is it? An array or a set of key/value pairs? Either way, we need to process it recursively. For arrays, this just means applying the processor to each value and then removing any duplicates. For plain objects, there's no native way to traverse the nested structure and apply the processor function, so we turn to the deepMapValues() function that we imported previously. So, assuming rawValue is an object:
if (rawValue instanceof Array) {
  returnValue = rawValue.map((value) => processRawValue(value));
  returnValue = _uniqBy(returnValue);
} else {
  returnValue = deepMapValues(rawValue, processRawValue);
}
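For readers who want to see what the recursive traversal amounts to, here is a minimal dependency-free sketch. The mapDeep helper and the Leaf and Nested types are hypothetical; this is not how just-deep-map-values or lodash are implemented, only an illustration of the same idea: map every leaf, recurse into plain objects, and de-duplicate arrays.

```typescript
// Hypothetical stand-in for illustration; not the library implementation.
type Leaf = string | number | boolean | null | undefined;
type Nested = Leaf | Nested[] | { [key: string]: Nested };

const mapDeep = (input: Nested, fn: (leaf: Leaf) => Leaf): Nested => {
  if (Array.isArray(input)) {
    // Map every element, then drop duplicates. A Set compares primitives by
    // value and objects by reference.
    const mapped = input.map((item) => mapDeep(item, fn));
    return Array.from(new Set(mapped));
  }
  if (input !== null && typeof input === "object") {
    // Plain object: rebuild it key by key, recursing into each value.
    const result: { [key: string]: Nested } = {};
    for (const key of Object.keys(input)) {
      result[key] = mapDeep(input[key], fn);
    }
    return result;
  }
  // Anything else is a leaf value.
  return fn(input);
};

const cleaned = mapDeep(
  { id: "7", tags: [" a ", "a", "b"], meta: { label: " Ada " } },
  (leaf) => (typeof leaf === "string" ? leaf.trim() : leaf)
);
console.log(JSON.stringify(cleaned));
// {"id":"7","tags":["a","b"],"meta":{"label":"Ada"}}
```

Note that the two "a" entries only collapse into one because the mapping runs before the de-duplication, which is the same order the processor uses.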
Process dates
Then, to decide whether a raw input value should be treated as a date, we just need to amend the string portion of the conditional logic to format the value as a date when appropriate. For this, we'll define a formatIfDate() function in our utilities file that takes in a string in any of the formats we expect and returns a consistently formatted string. Note that we need to specify the list of expected rawFormats because we don't want to accidentally parse a non-date string as a date just because the dayjs library is capable of doing so:
...
export const formatIfDate = (timestamp?: string) => {
  if (timestamp === undefined) {
    return null;
  }
  const rawFormats = ["MMMM D, YYYY", "MMMM D, YYYY - h:mma", "YYYY-MM-DD", "YYYY-MM-DD HH:mm:ss"];
  const returnFormat = "YYYY-MM-DD HH:mm:ss";
  // Parse once in strict mode and reuse the result; re-parsing without the
  // format list can fail for custom formats such as "MMMM D, YYYY - h:mma"
  const parsed = dayjs(timestamp, rawFormats, true);
  if (parsed.isValid()) {
    return parsed.format(returnFormat);
  }
  return timestamp;
};
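To illustrate what strict matching against a known format buys us, here is a dependency-free sketch for a single format, YYYY-MM-DD. The isStrictYmd helper is hypothetical and uses the native Date object rather than dayjs; it only shows the general idea of accepting a string when it both has the expected shape and describes a real calendar date.

```typescript
// Hypothetical helper for illustration; covers exactly one format (YYYY-MM-DD).
const isStrictYmd = (value: string): boolean => {
  // Shape check: four digits, two digits, two digits, nothing else.
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (match === null) return false;

  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);

  // Date.UTC rolls impossible dates over into the next month, so compare the
  // parts after the round trip to reject them.
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
};

console.log(isStrictYmd("2023-02-28")); // true
console.log(isStrictYmd("2023-02-30")); // false (no such day)
console.log(isStrictYmd("Order 2023")); // false (wrong shape)
console.log(isStrictYmd("2023-2-28"));  // false (month is not zero-padded)
```

A string that fails the check would simply be returned unchanged, which mirrors the fallback at the end of formatIfDate().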
In my case, I also wanted to look out for ISO-formatted dates, which may include a timezone offset. This is the most reliable/unambiguous format, and what I (and I would assume most people) prefer when passing around date strings between components, APIs, etc. To do so, we create another function that can check for the format and adjust the value if a timezone is set. The strategy here is to first extract the date/time portion of the string separate from the timezone offset, and then to create a new date object and check whether the ISO-formatted string version of that new date object matches the original value:
...
export const isISOString = (value: unknown) => {
  if (typeof value === "string") {
    let dateString;
    if (value.endsWith("Z")) {
      // Timezone is in Z format
      dateString = value.substring(0, value.length - 1);
    } else if (value.slice(-6).startsWith("-") || value.slice(-6).startsWith("+")) {
      // Timezone is in -/+hh:mm format
      dateString = value.substring(0, value.length - 6);
    } else {
      // No timezone designator, so not treated as a full ISO timestamp
      return false;
    }
    // Re-read the date/time portion as UTC so the check does not depend on
    // the timezone of the machine running the code
    const date = dayjs(dateString + "+00:00");
    if (date.isValid()) {
      const dateIsoFormat = date.toISOString();
      let formatMatches = dateIsoFormat === value;
      if (!formatMatches) {
        // Compare again without ".000Z" on one side and the offset on the other
        const dateIsoFormatComparisonString = dateIsoFormat.substring(0, dateIsoFormat.length - 5);
        const valueFormatComparisonString = value.substring(0, value.length - 6);
        formatMatches = dateIsoFormatComparisonString === valueFormatComparisonString;
      }
      return formatMatches;
    }
  }
  return false;
};
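The round-trip idea behind this check can be shown on its own with the native Date object. The roundTripsAsIso helper below is a hypothetical, simplified sketch: it handles only the exact form that toISOString() produces (UTC with milliseconds and a trailing Z), which is why the function above needs its second, looser comparison for strings with an offset.

```typescript
// Hypothetical helper for illustration: true only when the string is exactly
// what Date.prototype.toISOString() would produce for the same instant.
const roundTripsAsIso = (value: string): boolean => {
  const time = Date.parse(value);
  if (Number.isNaN(time)) return false; // not parseable as a date at all
  return new Date(time).toISOString() === value;
};

console.log(roundTripsAsIso("2023-06-01T12:30:00.000Z")); // true
console.log(roundTripsAsIso("2023-06-01T12:30:00Z"));     // false (no milliseconds)
console.log(roundTripsAsIso("June 1, 2023"));             // false (not ISO-formatted)
console.log(roundTripsAsIso("not a date"));               // false
```

The second case is a valid ISO 8601 string that still fails the exact comparison, so a stricter or looser rule is a design choice that depends on which variants your data sources actually send.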
Update processor function to handle objects and dates
With these pieces now in place, we can update (and rename) the original processor function to handle objects and dates. I also opted to allow for a debug parameter to be passed in so I can check how my processor function is performing. The final code looks like this:
import _uniqBy from "lodash.uniqby";
import deepMapValues from "just-deep-map-values";
import dayjs from "dayjs";
import utc from "dayjs/plugin/utc";
import customParseFormat from "dayjs/plugin/customParseFormat.js";
dayjs.extend(utc);
dayjs.extend(customParseFormat);
export const processRawValueRecursively = (rawValue: any, debug = false) => {
  if (debug) console.log({ rawValue });
  let returnValue = rawValue;
  if (rawValue === null || rawValue === "") {
    if (debug) console.log({ type: "null" });
    returnValue = null;
  } else if (rawValue === undefined) {
    if (debug) console.log({ type: "undefined" });
    returnValue = undefined;
  } else if (typeof rawValue === "object") {
    if (rawValue instanceof Array) {
      if (debug) console.log({ type: "array" });
      // Pass debug along so nested values are logged too
      returnValue = rawValue.map((value) => processRawValueRecursively(value, debug));
      returnValue = _uniqBy(returnValue);
    } else {
      if (debug) console.log({ type: "object" });
      // Wrap the callback so that any extra arguments deepMapValues passes
      // (such as the key) are not mistaken for the debug flag
      returnValue = deepMapValues(rawValue, (value) => processRawValueRecursively(value, debug));
    }
  } else if (rawValue === "true" || rawValue === true) {
    if (debug) console.log({ type: "boolean (true)" });
    returnValue = 1;
  } else if (rawValue === "false" || rawValue === false) {
    if (debug) console.log({ type: "boolean (false)" });
    returnValue = 0;
  } else if (Number.isInteger(Number(rawValue))) {
    if (debug) console.log({ type: "integer" });
    // Use Number() rather than parseInt(): parseInt("1e3", 10) returns 1, not 1000
    returnValue = Number(rawValue);
  } else if (!isNaN(Number(rawValue))) {
    if (debug) console.log({ type: "number" });
    returnValue = Number(rawValue);
  } else if (typeof rawValue === "string") {
    if (debug) console.log({ type: "string/date" });
    returnValue = formatIfDate(rawValue.trim());
  } else if (debug) {
    console.log({ type: "unknown" });
  }
  if (debug) console.log({ returnValue });
  return returnValue;
};
export const isISOString = (value: unknown) => {
  if (typeof value === "string") {
    let dateString;
    if (value.endsWith("Z")) {
      // Timezone is in Z format
      dateString = value.substring(0, value.length - 1);
    } else if (value.slice(-6).startsWith("-") || value.slice(-6).startsWith("+")) {
      // Timezone is in -/+hh:mm format
      dateString = value.substring(0, value.length - 6);
    } else {
      // No timezone designator, so not treated as a full ISO timestamp
      return false;
    }
    // Re-read the date/time portion as UTC so the check does not depend on
    // the timezone of the machine running the code
    const date = dayjs(dateString + "+00:00");
    if (date.isValid()) {
      const dateIsoFormat = date.toISOString();
      let formatMatches = dateIsoFormat === value;
      if (!formatMatches) {
        // Compare again without ".000Z" on one side and the offset on the other
        const dateIsoFormatComparisonString = dateIsoFormat.substring(0, dateIsoFormat.length - 5);
        const valueFormatComparisonString = value.substring(0, value.length - 6);
        formatMatches = dateIsoFormatComparisonString === valueFormatComparisonString;
      }
      return formatMatches;
    }
  }
  return false;
};
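export const formatIfDate = (timestamp?: string) => {
  if (timestamp === undefined) {
    return null;
  }
  const returnFormatString = "YYYY-MM-DD HH:mm:ss";
  const possibleDateFormats = ["MMMM D, YYYY", "MMMM D, YYYY - h:mma", "YYYY-MM-DD", "YYYY-MM-DD HH:mm:ss"];
  // Parse once in strict mode and reuse the result; re-parsing without the
  // format list can fail for custom formats such as "MMMM D, YYYY - h:mma"
  const parsed = dayjs(timestamp, possibleDateFormats, true);
  if (parsed.isValid()) {
    return parsed.format(returnFormatString);
  }
  // ISO strings need no format list; dayjs can read them directly
  if (isISOString(timestamp)) {
    return dayjs(timestamp).format(returnFormatString);
  }
  return timestamp;
};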