How to transform JSON to a simple structure?

From the source I am getting a JSON file with a nested structure.

[
  {
    "query": {
      "ids": [
        8016676
      ],
      "dimensions": [
        "ym:pv:date",
        "ym:pv:URLDomain",
        "ym:pv:URLPath"
      ],
      "metrics": [
        "ym:pv:users"
      ],
      "sort": [
        "-ym:pv:users"
      ],
      "date1": "2019-01-01",
      "date2": "2019-01-02",
      "filters": "ym:s:lastTrafficSource == 'organic'",
      "limit": 10,
      "offset": 1,
      "group": "Week",
      "auto_group_size": "1",
      "attr_name": "",
      "quantile": "50",
      "offline_window": "21",
      "attribution": "LastSign",
      "currency": "RUB",
      "adfox_event_id": "0"
    },
    "data": [
      {
        "dimensions": [
          {
            "name": "2019-01-02"
          },
          {
            "name": "test.ru",
            "favicon": "test.ru"
          },
          {
            "name": "/info",
            "favicon": "test.ru"
          }
        ],
        "metrics": [
          1845
        ]
      },
      {
        "dimensions": [
          {
            "name": "2019-01-02"
          },
          {
            "name": "test.ru",
            "favicon": "test.ru"
          },
          {
            "name": "/login",
            "favicon": "test.ru"
          }
        ],
        "metrics": [
          454
        ]
      }
  }
]

To convert it to CSV format and then save it to the database, I use a loop.

The problem is that I need to regularly save 1 million rows to the database. Cycle converting the file is taking too long.

Can you please tell me how can I change the original file more quickly?

You might look at using generated columns for this.

I have a project that routinely is ingesting data from remote api’s and rather than pick and choose what I store, I now simply store the entire json in a single column in the database and then create generated columns that pick and choose the data I need. Among other things, I like it because I have access to all the data, so if the business needs change in the future, I can simply modify the generated columns, or add new ones without having to reimport.

Perhaps if you don’t have to spin through each record, you will get to a reasonable level of performance.

I haven’t looked at your specific mapping, but I would venture to say you’d be able to map within the db generated columns.

2 Likes

@mebeingken, I understand correctly that you store each json response in a separate cell in a separate table. And then with a SQL query parse this cell and save the data to a new table?

I store the Jason in a single field, and then create additional columns that automatically extract data points from the json. These generated columns are not updated directly via sql inserts and updates, but rather they reflect data that exists in other columns of the same row.

When I was doing a similar task in n8n, I used the following code to convert JSON.

return $json["data"].map(item => {
  return {
    json: item
  }
});

@mebeingken , tell me, please, how can I use it in Wappler?

P.S. The code was not written by me, it was written to me by members of the n8n community. I, unfortunately, cannot program.

You could probably do it with a repeater etc., but with this kind of custom transformation I would suggest to create a custom action. What is the server model (ASP/PHP/Node) that you need it for?

I am using NodeJS. Patrick, can you please tell me how to create the required custom action?

That will require programming.

The code for the action would look like:

exports.transform = function(options) {
  return options.data.map(data => {
    return {
      date: data.dimensions[0].name,
      URLDomain: data.dimensions[1].name,
      URLPath: data.dimensions[2].name,
      users: metrics[0]
    }
  })
}

Patrick, I can’t even get it with your code :frowning:

JS file. (options.data) is a variable

exports.transform0 = function(options) {
  return (options.data).map(data => {
    return {
      date: data.dimensions[0].name,
      URLDomain: data.dimensions[1].name,
      URLPath: data.dimensions[2].name,
      users: metrics[0]
    }
  })
}

Hjson file

{
  type: 'customzero',
  module : 'ymtransform0',
  action : 'transform0',
  groupTitle : 'Import data',
  groupIcon : 'fas fa-lg fa-random comp-data',
  title : 'Yandex transform0 @@var(actionName)@@',
  icon : 'fas fa-lg fa-random comp-data',
  dataScheme: [],
  dataPickObject: true,
  properties : [
    {
      group: 'Properties',
      variables: [
        {
          name: 'actionName',
          optionName: 'name',
          title: 'Name',
          type: 'text',
          required: true,
          defaultValue: ''
        },
        { 
          name: 'data', 
          optionName: 'data', 
          title: 'Data', 
          type: 'text', 
          required: true, 
          serverDataBindings: true,
          defaultValue: ''
        },		
        {
          name: 'actionOutput',
          optionName: 'output',
          title: 'Output',
          type: 'boolean',
          defaultValue: false
        }
      ]
    }
  ]
}

Error

What did you pass as the data? My sample code would accept the data array with the dimensions and metrics object as option. I think in your case {{yandex_metrika_check_test.data.data}}.

I transfer the following data to the module, as in the screenshots.


There is a nested data property {{yandex_metrika_check_test.data.data}}, so 2x .data.

Unfortunately it didn’t help.

@patrick, tell me, please, do you have another idea how to solve this problem?

I think the parse is probably missing in the transform function, since you pass the data as an expression string it must be parsed first.

exports.transform0 = function(options) {
  options = this.parse(options);
  return (options.data).map(data => {
    return {
      date: data.dimensions[0].name,
      URLDomain: data.dimensions[1].name,
      URLPath: data.dimensions[2].name,
      users: metrics[0]
    }
  })
}
1 Like

Patrick, thank you! The module works correctly and is very fast compared to the loop.
The only thing is, there is an error in your code, you need to put “data” before “metrics”.

@patrick , can you tell me how to change the module so that it takes the key names from the “query” object and is not tied to a clear sequence and number of values? I just have situations where the set of measurements will be different.