China Mechanical Engineering, 2020, Vol. 31, Issue (14): 1700-1707, 1716. DOI: 10.3969/j.issn.1004-132X.2020.14.009

• Service-oriented Manufacturing •

Multi-dimensional Analysis Method of Industrial Big Data Based on JSON Document Structure

LI Minbo1,2, XU Xinxing1, LI Qiang3, HAN Le1,4

  1. Software School, Fudan University, Shanghai, 200433;
  2. Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, 200433;
  3. Shanghai SIMCom Limited, Shanghai, 200335;
  4. Qingdao Institute of Fudan University, Qingdao, Shandong, 266520
  • Received: 2019-08-05 Online: 2020-07-25 Published: 2020-08-26
  • About the author: LI Minbo, male, born in 1970, associate professor, Ph.D. His research interests include industrial big data and Internet of Things intelligent information. He has published more than 40 papers. E-mail: limb@fudan.edu.cn.
  • Funding:
    National Natural Science Foundation of China (61671157); Shanghai Science and Technology Innovation Action Plan (18511107800); Major Science and Technology Innovation Project of Shandong Province (2018CXGC0604)

Abstract: In intelligent manufacturing processes, industrial data exhibit complex correlations and multi-source heterogeneous characteristics, and the continuing growth of industrial big data makes data analysis and mining extremely complicated. Traditional industrial data analysis approaches based on data warehouses or relational databases process data inflexibly and are inefficient for analytical queries. An online analytical processing (OLAP) model architecture for industrial big data based on the JSON document structure was proposed. The Key-Value JSON document structure was used to define industrial data structures flexibly: the table structures holding dimensional information were converted into JSON-based document structures, and the dimensional information contained in each fact was stored as nested documents. A document tree was constructed with the analysis target as its root node. The document structure tree was stored on the Elasticsearch platform and an inverted index was built, so that query and analysis operations were transformed into traversal and query of document contents, with the inverted index improving the efficiency of analytical queries. An intelligent parsing engine supporting custom-configured search conditions and query statements was designed, enabling intelligent generation of visualization charts for multi-dimensional analysis of industrial data.

Key words: industrial big data, online analytical processing (OLAP) model architecture, multi-dimensional analysis, JSON document structure
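
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table names, field names, and sample values below are hypothetical, and a small in-memory inverted index stands in for Elasticsearch. The sketch embeds dimension rows into each fact as nested JSON documents, indexes every leaf of the resulting document tree, and answers a grouped aggregation by looking up the index and traversing document contents instead of joining tables.

```python
import json
from collections import defaultdict

# Hypothetical dimension tables (names and fields are illustrative only).
EQUIPMENT = {
    "E01": {"equipment_id": "E01", "type": "CNC", "line_id": "L1"},
    "E02": {"equipment_id": "E02", "type": "Press", "line_id": "L2"},
}
LINES = {
    "L1": {"line_id": "L1", "workshop": "Machining"},
    "L2": {"line_id": "L2", "workshop": "Stamping"},
}

# Hypothetical fact rows: one quality-inspection record per row.
FACTS = [
    {"record_id": 1, "equipment_id": "E01", "defects": 2},
    {"record_id": 2, "equipment_id": "E01", "defects": 0},
    {"record_id": 3, "equipment_id": "E02", "defects": 5},
]


def to_document(fact):
    """Embed the dimension rows referenced by a fact as nested sub-documents."""
    equipment = dict(EQUIPMENT[fact["equipment_id"]])
    equipment["line"] = dict(LINES[equipment.pop("line_id")])
    doc = {k: v for k, v in fact.items() if k != "equipment_id"}
    doc["equipment"] = equipment
    return doc


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf of a nested document."""
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def build_inverted_index(docs):
    """Map each (path, value) term to the set of document positions containing it."""
    index = defaultdict(set)
    for pos, doc in enumerate(docs):
        for term in flatten(doc):
            index[term].add(pos)
    return index


def sum_by(docs, index, group_path, measure, filters=()):
    """Sum `measure` grouped by `group_path` over documents matching all filters."""
    candidates = set(range(len(docs)))
    for term in filters:
        # Each filter is a (path, value) term resolved through the inverted index.
        candidates &= index.get(term, set())
    totals = defaultdict(int)
    for pos in sorted(candidates):
        leaves = dict(flatten(docs[pos]))
        totals[leaves[group_path]] += leaves[measure]
    return dict(totals)


docs = [to_document(fact) for fact in FACTS]
index = build_inverted_index(docs)
print(json.dumps(docs[0], indent=2))
print(sum_by(docs, index, "equipment.line.workshop", "defects"))
print(sum_by(docs, index, "equipment.type", "defects",
             filters=[("equipment.type", "CNC")]))
```

Because each fact document already carries its dimension attributes, grouping by `equipment.line.workshop` needs no join. On a real Elasticsearch deployment the same roles would presumably be played by nested or object fields and aggregations, which this sketch does not attempt to reproduce.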

CLC number: