Sometimes the core functions are just not enough. Sometimes we need more power: the ability to hook in custom code and custom business logic.
Spectral does a great job with custom functions, so vacuum has adopted a very similar design to facilitate them.
Since v0.3.0, vacuum supports JavaScript-based functions as well as golang-based functions.
vacuum is written in golang. When building a JavaScript function, we need to remember that the JavaScript code is read in by vacuum, then parsed and executed by the goja JavaScript engine.
There is no DOM available and there is limited access to Node.js APIs. This is not a V8 environment.
Since v0.22.0, vacuum supports async functions and Promises, including the built-in fetch() API for making HTTP requests. You can now write async function runRule(input) and use await inside your custom functions.
Structure of a JavaScript function
vacuum expects two required functions from a JavaScript custom function:
- A getSchema() function that returns metadata about the function
- A runRule() function that contains the actual validation logic (can be async)
Here’s the basic structure:
// Required: Define the function metadata
function getSchema() {
  return {
    "name": "myCustomFunction", // This name is used in rulesets
    "description": "Validates that input matches expected value"
  };
}

// Required: Implement the validation logic
function runRule(input) {
  // do something useful with custom logic, this is just a silly example
  if (input !== 'some-value') {
    return [
      { message: 'something went wrong, input does not match "some-value"' }
    ];
  }
}
In the above example, the runRule function accepts a single argument called input. This is the value of the given property
that was located by the rule.
If the input does not match some-value, the function returns an array of objects. Each object is a model.RuleFunctionResult that
contains a message property. This is the message that will be displayed in the linting report.
The input argument
The input can be either a primitive value, or a complex value (like an object or an array). It all depends on how the rule
is configured to use the function, and what the given property is set to.
If the function only works on a single value, then the given property should be set to a JSON Path that points to a single value.
If the function needs to work on an array of values, then the given property should be set to a JSON Path that points to an array, etc.
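As a rough sketch of how one function might cope with either shape, the example below normalizes the input into an array before checking it. The function name checkNotBlank and the "every value must be a non-empty string" rule are invented purely for illustration.

```javascript
// Hypothetical example: the name and the "non-blank string" rule are illustrative only.
function getSchema() {
  return {
    name: "checkNotBlank",
    description: "Checks that every matched value is a non-empty string"
  };
}

function runRule(input) {
  // the 'given' path may deliver a single value or an array of values
  const values = Array.isArray(input) ? input : [input];
  const results = [];
  values.forEach(function (value, i) {
    if (typeof value !== "string" || value.trim() === "") {
      results.push({ message: "value at position " + i + " is blank or not a string" });
    }
  });
  return results;
}
```

Normalizing up front keeps the validation loop identical whether the rule points at one value or many.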
Access to context
If the function needs to know what rule is calling it, or what the specification looks like, then the runRule function can
access the context object. This is essentially a JSON rendered version of model.RuleFunctionContext.
There are two objects that are not available to JavaScript functions, the index and the document.
The index is not available, because it’s a complex struct with a lot of data in it and a large API. Without a full
replacement SDK, we can’t make this available to JavaScript functions.
The document is not available, because of the same reason as the index.
Example function that uses context
function runRule(input) {
  // use context to determine the rule name and the given path
  if (input !== 'some-value') {
    return [
      { message: 'rule ' + context.rule.id + ' failed at ' + context.given }
    ];
  }
}
The context object is automatically available to the function as a global. It does not need to be declared as an argument to the runRule function.
Context type
TypeScript is not (yet) supported, but here is what the type definitions for the context and related objects would look like.
interface Context {
  ruleAction: RuleAction;
  rule: Rule;
  given: any;
  options: any;
  specInfo: SpecInfo;
}
SpecInfo type
interface SpecInfo {
  fileType: string;
  format: string;
  type: string;
  version: string;
}
RuleAction type
interface RuleAction {
  field: string;
  functionOptions: any;
}
Rule type
interface Rule {
  id: string;
  description: string;
  given: any;
  formats: string[];
  resolved: boolean;
  recommended: boolean;
  type: string;
  severity: string;
  then: any;
  ruleCategory: RuleCategory;
  howToFix: string;
}
RuleCategory type
interface RuleCategory {
  id: string;
  name: string;
  description: string;
}
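To show how these fields might be read in practice, here is a minimal sketch that builds a message from context.rule, context.given and context.specInfo. The function name describeCaller is hypothetical, and the guards assume that any of these fields could be missing.

```javascript
// Sketch only: field names follow the type definitions in this section;
// the function name 'describeCaller' is hypothetical.
function getSchema() {
  return {
    name: "describeCaller",
    description: "Reports which rule and spec version triggered the function"
  };
}

function runRule(input) {
  // 'context' is provided by vacuum as a global
  const ruleId = context.rule ? context.rule.id : "unknown-rule";
  const spec = context.specInfo || {};
  if (input === undefined || input === null) {
    return [{
      message: ruleId + ": nothing found at " + context.given +
        " (" + spec.type + " " + spec.version + ")"
    }];
  }
  return [];
}
```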
Access to the function options
All functions can accept options from the rule that is calling them. Options are available from the functionOptions value. To access it,
start with the context object; it’s under the ruleAction property.
Example rule and function
Let’s configure a ruleset that defines a custom function and passes in some function options.
rules:
  my-custom-js-rule:
    description: "adding function options to use in JS functions."
    given: $
    severity: error
    then:
      function: useFunctionOptions
      field: someField
      functionOptions:
        someOption: someValue
Now we can create a new file called useFunctionOptions.js and write the function logic.
// Required: Define the function metadata
function getSchema() {
  return {
    "name": "useFunctionOptions",
    "description": "Demonstrates using function options"
  };
}

// Required: Implement the validation logic
function runRule(input) {
  // extract function options from context
  const functionOptions = context.ruleAction.functionOptions
  // check if the 'someOption' value is set in our options
  if (functionOptions.someOption) {
    return [
      {
        message: "someOption is set to " + functionOptions.someOption,
      }
    ];
  } else {
    return [
      {
        message: "someOption is not set",
      }
    ];
  }
}
The message will result in: someOption is set to someValue
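Options are often optional, so a function may want a fallback. The sketch below reads a numeric option and falls back to a default when it is absent. The option name maxLength, the default of 80 and the function name checkMaxLength are all invented for this example and are not vacuum built-ins.

```javascript
// Hypothetical function: 'maxLength' is an invented option name, not a vacuum built-in.
function getSchema() {
  return {
    name: "checkMaxLength",
    description: "Checks that a string is no longer than functionOptions.maxLength"
  };
}

function runRule(input) {
  // functionOptions may be missing if the rule does not define any
  const options = (context.ruleAction && context.ruleAction.functionOptions) || {};
  // fall back to a default of 80 when the option is absent or not a positive number
  const parsed = Number(options.maxLength);
  const maxLength = Number.isFinite(parsed) && parsed > 0 ? parsed : 80;
  if (typeof input === "string" && input.length > maxLength) {
    return [{
      message: "value is " + input.length + " characters, the limit is " + maxLength
    }];
  }
  return [];
}
```

In a ruleset this would be configured under functionOptions in the same way as someOption above.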
The given value in a ruleset is a JSON Path.
The getSchema() function (Required)
Since vacuum v0.17.x, the getSchema() function is required for JavaScript custom functions. Without it, your function will not be registered and cannot be used in rulesets.
The getSchema() function returns an object that defines metadata about your custom function, including the name that will be used to reference it in rulesets.
For example:
function getSchema() {
  return {
    "name": "myFunctionName", // IMPORTANT: This is the name used in rulesets
    "description": "What this function does",
    "properties": [ // Optional: Define expected properties
      {
        "name": "propertyName",
        "description": "Description of this property"
      }
    ],
  };
}
Important:
The name property in the schema is what you’ll use to reference the function in your rulesets. For example, if your schema returns "name": "checkApiVersion", you would use function: checkApiVersion in your ruleset (not the filename).
Calling core functions
Any custom JavaScript function can call any of the core functions.
This is done by adding a prefix vacuum_ to the function name. For example, to call the truthy function, the function name
would be vacuum_truthy. Another function like schema would be vacuum_schema.
Each core function accepts two arguments, the first is the input value, and the second is the context object.
For example:
function runRule(input) {
  // call the core 'truthy' function; it returns an array of results
  let results = vacuum_truthy(input, context);
  results.push({
    message: "this is a message, added after truthy was called",
  });
  // return the combined results
  return results;
}
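Results from a core function can also be adjusted before they are returned. The following sketch prefixes each message with the id of the calling rule. It assumes that every result returned by vacuum_truthy carries a message property, which matches the result shape used elsewhere on this page but is not spelled out for core functions.

```javascript
// Sketch: assumes each result returned by a core function has a 'message' property.
function runRule(input) {
  const results = vacuum_truthy(input, context) || [];
  // tag every core result with the id of the calling rule
  results.forEach(function (result) {
    result.message = "[" + context.rule.id + "] " + result.message;
  });
  return results;
}
```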
Tutorial
Make sure you have vacuum installed first.
Setting things up
Then create a new directory for your functions, and change into it.
Next, build a custom JavaScript function that will be used by vacuum. This is a simple function that checks if there is more than one path in an OpenAPI document.
Implement the logic
Create a new JavaScript file called checkSinglePath.js and write the function logic.
// Required: Define the function metadata
function getSchema() {
  return {
    "name": "checkSinglePath", // This name will be used in rulesets
    "description": "Checks that only a single path exists in the API"
  };
}

// Required: Implement the validation logic
function runRule(input) {
  // get the number of keys passed in (each path is a key)
  const numPaths = Object.keys(input).length;
  if (numPaths > 1) {
    return [
      {
        message: 'More than a single path exists, there are ' + numPaths
      }
    ];
  }
}
That’s it! We’re ready to call the function from a rule.
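Because runRule is a plain function, its logic can be sanity-checked outside vacuum before wiring it into a rule. The sketch below repeats the function so that it runs on its own in Node.js, and feeds it two hand-written objects standing in for the value found at $.paths. The sample data is hypothetical.

```javascript
// Same logic as checkSinglePath.js, repeated here so the snippet runs on its own.
function runRule(input) {
  const numPaths = Object.keys(input).length;
  if (numPaths > 1) {
    return [{ message: 'More than a single path exists, there are ' + numPaths }];
  }
}

// Hypothetical sample data standing in for the value found at $.paths
const onePath = { "/users": {} };
const twoPaths = { "/users": {}, "/orders": {} };

console.log(runRule(onePath));  // undefined, no violations
console.log(runRule(twoPaths)); // one violation reporting 2 paths
```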
Configuring functions
To use the function, we need to call it from a rule. Create a new RuleSet, or update an existing one, to include a new rule that calls the new custom function.
Example RuleSet
extends: [[vacuum:oas, off]]
documentationUrl: https://quobix.com/vacuum/rulesets/custom-rulesets
rules:
  sample-paths-rule:
    description: Load a custom function that checks for a single path
    severity: error
    recommended: true
    formats: [ oas2, oas3 ]
    given: $.paths
    then:
      function: checkSinglePath
    howToFix: use a spec with only a single path defined.
It’s critical that the function property in your ruleset’s then object matches the name property returned by your function’s getSchema() method. The filename is no longer used for function registration.
Run vacuum with ‘-f’
To run custom functions, vacuum needs to know where to look for them. There is a global flag -f or --functions that specifies
a path to where your custom function plugin is located.
There should be a message informing you that vacuum has located a function plugin, and how many functions were loaded.
Example output
When the rule runs, if there is more than a single path in the OpenAPI document, then the function will return a message.
For example:
More than a single path exists, there are 23
Async Functions & Promises
In v0.22.0, vacuum added support for asynchronous JavaScript functions using async/await syntax. It’s
actually pretty cool; I needed to build a custom event loop to power goja.
You can see it here
This enables custom functions to perform time-delayed operations or handle any other asynchronous work.
Writing an Async Function
To write an async custom function, simply declare runRule as an async function:
function getSchema() {
  return {
    name: "myAsyncFunction",
    description: "An async function that does something asynchronous"
  };
}

async function runRule(input) {
  // You can use await inside async functions
  const response = await fetch("https://api.example.com/validate");
  const data = await response.json();
  if (!data.valid) {
    return [{
      message: "Validation failed: " + data.reason
    }];
  }
  return [];
}
The key differences:
- Use async function runRule(input) instead of function runRule(input)
- You can use await to wait for Promises to resolve
- The function automatically returns a Promise
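Since runRule returns a Promise, several asynchronous checks can in principle be awaited together. The sketch below runs two checks with Promise.all and merges their results. The helpers checkLength and checkCapital are hypothetical stand-ins for real asynchronous work, and the example assumes the standard Promise.all is available in vacuum's runtime.

```javascript
// Hypothetical helpers: each stands in for some asynchronous check.
async function checkLength(value) {
  return value.length > 10 ? [] : [{ message: "description is shorter than 11 characters" }];
}

async function checkCapital(value) {
  return /^[A-Z]/.test(value) ? [] : [{ message: "description does not start with a capital letter" }];
}

async function runRule(input) {
  if (typeof input !== "string") {
    return [];
  }
  // start both checks, then wait for both Promises to resolve
  const parts = await Promise.all([checkLength(input), checkCapital(input)]);
  // merge the two result arrays into one
  return parts[0].concat(parts[1]);
}
```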
fetch() API
Like async functions, in v0.22.0 vacuum added a built-in implementation of the Web Fetch API for making HTTP requests from custom JavaScript functions.
This follows the standard Fetch API with security restrictions to protect against malicious usage.
Basic Usage
async function runRule(input) {
  // Simple GET request
  const response = await fetch("https://api.example.com/data");
  // Check if request was successful
  if (!response.ok) {
    console.log("Request failed with status: " + response.status);
    return [];
  }
  // Parse JSON response
  const data = await response.json();
  // Use the data for validation
  if (data.someField !== input.expectedValue) {
    return [{
      message: "Value mismatch: expected " + data.someField
    }];
  }
  return [];
}
fetch() Options
The fetch() function accepts an optional second parameter with request options:
const response = await fetch("https://api.example.com/endpoint", {
  method: "POST", // HTTP method: GET, POST, PUT, DELETE, etc.
  headers: { // Custom headers
    "Content-Type": "application/json",
    "Authorization": "Bearer token123"
  },
  body: JSON.stringify({ // Request body (for POST/PUT)
    key: "value"
  }),
  redirect: "follow" // Redirect mode: "follow", "error", or "manual"
});
Response Object
The Response object returned by fetch() provides several methods and properties:
const response = await fetch("https://api.example.com/data");
// Properties
console.log(response.ok); // true if status 200-299
console.log(response.status); // HTTP status code (200, 404, etc.)
console.log(response.statusText); // Status text ("OK", "Not Found", etc.)
console.log(response.url); // Final URL (after redirects)
console.log(response.redirected); // true if redirected
// Methods to read response body (each can only be called once)
const json = await response.json(); // Parse as JSON
const text = await response.text(); // Read as plain text
// Access headers
const contentType = response.headers.get("Content-Type");
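Because a body read cannot be repeated, one defensive pattern is to read the body a single time as text and then attempt to parse it. The helper below is a sketch of that idea. The name readBodyOnce is hypothetical and is not part of vacuum or the Fetch API.

```javascript
// Sketch: reads the body a single time as text, then tries to parse it as JSON.
// 'readBodyOnce' is a hypothetical helper name.
async function readBodyOnce(response) {
  const text = await response.text(); // the only body read
  try {
    return { json: JSON.parse(text), text: text };
  } catch (e) {
    // not valid JSON, so only the raw text is available
    return { json: null, text: text };
  }
}
```

Inside runRule it could be used as `const body = await readBodyOnce(await fetch(url));`, after which body.json is either the parsed value or null.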
Security Restrictions
For security reasons, fetch() has several important restrictions:
- Only https:// URLs are allowed unless you enable HTTP mode with --allow-http
- Requests to localhost, 10.x.x.x, 192.168.x.x, etc. are blocked without --allow-private-networks
- Requests time out after 30 seconds by default unless changed with --fetch-timeout
- TLS certificate verification is enforced unless disabled with --insecure
- Response bodies are limited to 10MB
To override these restrictions:
# Allow HTTP (non-HTTPS) URLs (not recommended for production)
vacuum lint spec.yaml --allow-http
# Skip TLS certificate verification for HTTPS requests (different from --allow-http)
vacuum lint spec.yaml --insecure
# Allow requests to private/local networks
vacuum lint spec.yaml --allow-private-networks
# Set custom timeout (in seconds)
vacuum lint spec.yaml --fetch-timeout 60
Important distinction:
- --allow-http allows fetch() to use HTTP (non-HTTPS) URLs
- --insecure skips TLS certificate verification for HTTPS requests
These flags serve different purposes and can be used independently.
Complete Example
I built a fun little API called the Sentiment Analyzer to demonstrate the power of this new feature.
It takes in descriptions and decides whether each one is too sad. If so, it returns a response with tooSad: true set.
This example validates that API descriptions aren’t overly negative by calling an external sentiment analysis API. You can see the ruleset on GitHub, as well as the sentiment check example source.
Note: This example uses per-node mode (the default). For better performance with multiple descriptions, see the Batch Mode section below which reduces 29 API calls to just 1.
This is an actual working API. It uses the VADER analysis technique, but does it much faster, as it’s implemented in pure Go.
function getSchema() {
  return {
    name: "sentimentCheck",
    description: "Checks if API descriptions are too negative using sentiment analysis"
  };
}

async function runRule(input) {
  // skip empty or non-string inputs
  if (!input || typeof input !== "string") {
    return [];
  }
  // call sentiment analysis API
  const response = await fetch("https://api.pb33f.io/sentiment-check", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      text: input
    })
  });
  if (!response.ok) {
    console.log("Sentiment API error: " + response.status);
    return [];
  }
  const data = await response.json();
  // check if sentiment is too negative
  if (data.sentiment === "negative" && data.score < -0.5) {
    const words = data.negativeWords || [];
    const wordList = words.map(function(w) {
      return "'" + w + "'";
    }).join(", ");
    let message = "Description is too negative";
    if (wordList) {
      message += ", words like " + wordList + " lower the overall tone";
    }
    return [{ message: message }];
  }
  return [];
}
You can run this yourself, because the API is real. Check out the vacuum code and run the following from the repository root.
vacuum lint model/test_files/sad-spec.yaml -r rulesets/examples/sentiment-check.yaml -f plugin/sample/js -d
You will see custom JavaScript, using a real HTTP API, configured by a ruleset and returning results as violations.
Error Handling
It’s simple to handle fetch errors:
async function runRule(input) {
  try {
    const response = await fetch("https://api.example.com/data");
    if (!response.ok) {
      // Handle HTTP errors (404, 500, etc.)
      console.log("API returned status: " + response.status);
      return [];
    }
    const data = await response.json();
    // Process data...
  } catch (error) {
    // Handle network errors, timeouts, etc.
    console.log("Fetch error: " + error.message);
    return [];
  }
}
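If several functions share this pattern, the try/catch can be pulled into a small helper. The sketch below returns the parsed JSON, or null when anything goes wrong. The name fetchJsonOrNull and its fetchImpl parameter are hypothetical. The parameter exists only so that the helper can be exercised with a fake fetch outside vacuum.

```javascript
// Hypothetical helper: returns parsed JSON, or null if anything goes wrong.
// 'fetchImpl' exists only so the helper can be exercised with a fake fetch.
async function fetchJsonOrNull(url, options, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  try {
    const response = await doFetch(url, options);
    if (!response.ok) {
      console.log("API returned status: " + response.status);
      return null;
    }
    return await response.json();
  } catch (error) {
    console.log("Fetch error: " + error.message);
    return null;
  }
}

async function runRule(input) {
  const data = await fetchJsonOrNull("https://api.example.com/data");
  if (data === null) {
    return []; // treat an unreachable API as "no violations"
  }
  // process data...
  return [];
}
```

Returning no violations when the API is unreachable is a design choice. A stricter function could return a result that reports the failure instead.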
Fetch Configuration via CLI
The following global flags control fetch() behavior:
- --allow-http: Allow fetch() to use HTTP (non-HTTPS) URLs (not recommended for production)
- --insecure: Skip TLS certificate verification for HTTPS requests (different from --allow-http)
- --allow-private-networks: Allow fetch() to access localhost and private IP ranges (10.x, 192.168.x, etc.)
- --fetch-timeout: Set timeout for fetch() requests in seconds (default: 30)
- --cert-file, --key-file, --ca-file: Configure TLS certificates for HTTPS requests
Note: --allow-http and --insecure serve different purposes:
- Use --allow-http to allow HTTP URLs (e.g., http://example.com)
- Use --insecure to skip certificate verification for HTTPS URLs with invalid/self-signed certificates
Batch Mode Processing
Since v0.22.0, vacuum supports batch mode for custom JavaScript functions. This allows a single function call to
process ALL JSONPath matches at once, instead of calling the function separately for each match.
Batch mode results MUST include the original input object. If you omit the input field or provide an invalid input
object, vacuum will generate an error result explaining this requirement.
This is essential, because it lets vacuum map your custom violation back to the node located via the given path.
return [{ message: "error", input: inputs[i] }];
Batch mode is only available to custom JavaScript functions. The same concept does not apply to Go custom functions.
When to Use Batch Mode
Batch mode is ideal for functions that:
- Call External APIs
- Have High Overhead
- Need Context Across Matches
- Rate-Limited Services
Enabling Batch Mode
To enable batch mode, add batch: true to the functionOptions in your ruleset:
rules:
  my-batch-rule:
    description: Process all matches in a single batch
    given: $..description
    severity: warn
    then:
      function: myBatchFunction
      functionOptions:
        batch: true # enable batch mode
Batch Mode Function Signature
When batch mode is enabled, the runRule function receives an array of objects instead of a single value:
async function runRule(inputs) {
  // inputs is an array: [{value: "...", index: 0}, {value: "...", index: 1}, ...]
  // each input object has:
  //   - value: The matched value from the JSONPath
  //   - index: The position in the match list (0-based)
}
Batch Mode Return Format
Functions in batch mode MUST return an array of result objects that include the original input object:
async function runRule(inputs) {
  // collect violations; each one carries its original input object
  var results = [];
  for (var i = 0; i < inputs.length; i++) {
    // hasError() stands in for your own validation logic
    if (hasError(inputs[i].value)) {
      results.push({
        message: "Error found",
        input: inputs[i] // REQUIRED: must include the original input object
      });
    }
  }
  return results;
}
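Batch mode is also useful without any network call, when a check needs to see every match at once. As an illustration, the sketch below flags descriptions that repeat an earlier one. The function name noDuplicateDescriptions is hypothetical, and the rule that calls it would need batch: true in its functionOptions.

```javascript
// Hypothetical batch function: flags descriptions that repeat an earlier one.
function getSchema() {
  return {
    name: "noDuplicateDescriptions",
    description: "Flags descriptions that duplicate an earlier description (batch mode)"
  };
}

async function runRule(inputs) {
  const firstSeen = {}; // description text -> index of its first occurrence
  const results = [];
  for (let i = 0; i < inputs.length; i++) {
    const item = inputs[i];
    if (typeof item.value !== "string") {
      continue;
    }
    if (Object.prototype.hasOwnProperty.call(firstSeen, item.value)) {
      results.push({
        message: "description duplicates match #" + firstSeen[item.value],
        input: item // REQUIRED in batch mode: the original input object
      });
    } else {
      firstSeen[item.value] = item.index;
    }
  }
  return results;
}
```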
Try It Yourself: Batch vs Per-Node Mode
Vacuum includes working examples you can run to see the difference between batch and per-node modes. Both examples use the Sentiment Analyzer API to check if API descriptions are too negative.
Run the per-node mode example (29 API calls, ~3 seconds):
vacuum lint model/test_files/sad-spec.yaml \
-r rulesets/examples/sentiment-check.yaml \
-f plugin/sample/js
Run the batch mode example (1 API call, ~100 milliseconds):
vacuum lint model/test_files/sad-spec.yaml \
-r rulesets/examples/sentiment-check-batch.yaml \
-f plugin/sample/js
Both produce identical results (18 warnings), but batch mode is ~33x faster because it makes a single API call instead of 29 separate calls.
Batch Mode Example Files
The example files are included in the vacuum repository:
- Batch function: plugin/sample/js/sentiment_check_batch.js
- Batch ruleset: rulesets/examples/sentiment-check-batch.yaml
- Per-node function: plugin/sample/js/sentiment_check.js
- Per-node ruleset: rulesets/examples/sentiment-check.yaml
- Test spec: model/test_files/sad-spec.yaml
Batch Mode Ruleset Configuration
extends: [[vacuum:oas, off]]
rules:
  sentiment-check-batch:
    description: Check if API descriptions are too negative (batch mode)
    severity: warn
    formats: [oas2, oas3]
    given: $..description
    then:
      function: sentimentCheckBatch
      functionOptions:
        batch: true # Process all descriptions at once
Batch Mode Function
function getSchema() {
  return {
    name: "sentimentCheckBatch",
    description: "Checks if API descriptions are too negative using sentiment analysis (batch mode)"
  };
}

async function runRule(inputs) {
  // build request body and a lookup map: id -> original input object
  var requestBody = [];
  var idToInput = {};
  for (var i = 0; i < inputs.length; i++) {
    var input = inputs[i];
    if (input.value && typeof input.value === "string") {
      requestBody.push({
        id: input.index,
        description: input.value
      });
      // store original input for result mapping (REQUIRED for batch mode)
      idToInput[input.index] = input;
    }
  }
  // nothing to do
  if (requestBody.length === 0) {
    return [];
  }
  // single API call with all descriptions
  var response = await fetch("https://api.pb33f.io/sentiment-check", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(requestBody)
  });
  if (!response.ok) {
    console.log("Sentiment API error: " + response.status);
    return [];
  }
  var data = await response.json();
  if (!data.results || data.results.length === 0) {
    return [];
  }
  // map API results back to vacuum results
  var results = [];
  for (var j = 0; j < data.results.length; j++) {
    var result = data.results[j];
    // only report violations
    if (!result.tooSad) {
      continue;
    }
    // format negative words
    var words = result.negativeWords || [];
    var wordList = words.map(function(w) { return "'" + w + "'"; }).join(", ");
    var message = "description is too negative";
    if (wordList) {
      message += ", words like " + wordList + " are lowering your overall tone";
    }
    // REQUIRED: return result with original input object for deterministic node mapping
    // vacuum uses input.index to map back to the correct node
    results.push({
      message: message,
      input: idToInput[result.id] // maps back to the original node
    });
  }
  return results;
}
Migrating to Batch Mode
If you have an existing per-node function that calls an external API:
Before (Per-Node):
async function runRule(input) {
  const response = await fetch("https://api.example.com/check", {
    method: "POST",
    body: JSON.stringify({ text: input })
  });
  const result = await response.json();
  if (result.hasError) {
    return [{ message: result.error }];
  }
  return [];
}
After (Batch Mode):
async function runRule(inputs) {
  // Build batch request and create lookup map
  var batch = [];
  var idToInput = {};
  for (var i = 0; i < inputs.length; i++) {
    batch.push({ id: inputs[i].index, text: inputs[i].value });
    idToInput[inputs[i].index] = inputs[i]; // Preserve input objects
  }
  const response = await fetch("https://api.example.com/check-batch", {
    method: "POST",
    body: JSON.stringify(batch)
  });
  const results = await response.json();
  // Map results back with original input objects (REQUIRED)
  return results
    .filter(r => r.hasError)
    .map(r => ({
      message: r.error,
      input: idToInput[r.id] // Include original input object
    }));
}
And add batch: true to your ruleset’s functionOptions.
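One thing to watch during migration is an API response that contains an id the function never sent, since a result without a valid input object produces an error result. The helper below is a sketch that drops such entries instead of returning them. The name mapBatchResults is hypothetical, and the response shape (id, hasError, error) simply mirrors the migration example above.

```javascript
// Hypothetical helper: converts API results into vacuum results, dropping any
// entry whose id cannot be mapped back to an original input object.
function mapBatchResults(apiResults, idToInput) {
  const out = [];
  (apiResults || []).forEach(function (r) {
    if (!r.hasError) {
      return; // only report violations
    }
    const original = idToInput[r.id];
    if (!original) {
      console.log("skipping result with unknown id: " + r.id);
      return;
    }
    out.push({ message: r.error, input: original });
  });
  return out;
}
```

In the batch function, the final return statement could then become `return mapBatchResults(results, idToInput);`.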
