Big Data Terminology Sets Three World Firsts

– Interview with Professor Lian Yuming, chief editor of Big Data Terminology and director of the Key Laboratory of Big Data Strategy

Editor’s note: Big Data Terminology (20 Volumes), the world’s first smart multilingual terminology series that systematically deals with different aspects of big data, was published in May 2021 by Science Press and distributed globally. To better explore its compilation process, creative content and contemporary value, the reporter conducted an exclusive interview with Professor Lian Yuming, chief editor of the series and director of the Key Laboratory of Big Data Strategy.

Reporter: Big Data Terminology is the world’s first smart multilingual terminology series that systematically deals with different aspects of big data. Could you please introduce the background of its compilation?

Lian Yuming: In today’s world, information technology is advancing with each passing day. Digitization, networking, and intelligent technology are developing fast. Following agricultural civilization and industrial civilization, humanity is entering the era of digital civilization and there is an urgent need for rules governing it. As the core area of China’s first national big data comprehensive pilot zone, Guiyang has introduced the concept of big data terminology, launched a big data terminology project, and compiled the world’s first multilingual big data terminology series. These endeavors aim to build a standardized terminology system for big data, to shape the public discourse system, to facilitate communication between industries and sectors, and to help people around the world grasp the opportunities brought by digitalization, networking, and intelligent technology, and properly handle the legislation, security, and governance challenges faced by the big data industry.

As the linguistic representation of scientific knowledge, terminology is a kind of foundational information resource that supports the scientific, technological, and academic discourse system. It is an epitome of what human wisdom has produced and maps out the trajectory of science, technology and human progress. In Erya, China’s first encyclopedic dictionary, 16 of the 19 categories focus on nouns; Dream Pool Essays was the first to give the name shiyou (petroleum), and the term remains in use today. The scientific and technological terms that have been passed down since ancient times reflect the deep thoughts given by sages to naming scientific and technological concepts and have made indelible contributions to cultural inheritance and scientific and technological exchanges in history. Major changes unseen in a century are taking place in our world. Disruptive technologies follow one another, and new concepts, theories and methods continue to emerge. A long-term focus on the standardization of scientific and technological terms in the new era is to serve cutting-edge basic research as well as new scientific fields and systems.

Big Data Terminology is the world’s first smart multilingual terminology series that systematically deals with different aspects of big data. It is compiled by the Key Laboratory of Big Data Strategy, in cooperation with dozens of professional institutions and hundreds of experts in and outside of China. Big Data Terminology claims three world firsts: (i) It represents the world’s first big data terminology system that is unified, standardized, and conforming to international principles, and all entries have been reviewed by an expert panel of China National Committee for Terminology in Science and Technology; (ii) it is the world’s first big data terminology written in 21 languages, including languages used in the countries along the Belt and Road; and (iii) it is the world’s first big data terminology that comes with a multilingual audio database and audio book feature. In light of big data development around the world, Big Data Terminology establishes a standardized system of concepts that covers many languages with the Chinese terminology as the basis, aiming to advance rulemaking and drive innovation and the development of global digital civilization.

Reporter: Big Data Terminology is a major innovation in the frontiers of science and technology. How did your team ensure its content is forward-looking, professional and innovative during the research and compilation process?

Lian Yuming: As you said, Big Data Terminology is an outcome of globalized, interdisciplinary, professional, and open research efforts. It is an important innovation in the frontiers of science and technology, claiming several world firsts. During the research and compilation process, we encountered a lot of challenges. We paid attention to resource integration and mechanism innovation and worked with the most professional teams to control the entire process of research, review, translation, and publication in accordance with the most stringent standards to ensure the content is forward-looking, professional and innovative.

Big Data Terminology is compiled by the Key Laboratory of Big Data Strategy, in cooperation with dozens of professional institutions and hundreds of experts in and outside of China. The five major research centers of the laboratory gave full play to their strength in platforms and the experts contributed their expertise during the research and compilation process. Entries are selected by experts from authoritative databases, including Science Citation Index (SCI), Social Science Citation Index (SSCI), Engineering Index (EI) and Index to Scientific & Technical Proceedings (ISTP). The underlying corpus is formed on the basis of big data literature on the CNKI platform. Great efforts are made to ensure the entries are accurate, scientific, and of great utility based on experts’ research findings. China National Committee for Terminology in Science and Technology organized a team of top experts from the Chinese Academy of Sciences and other related institutes across the world to review the terminology. More than 60 experts from the Chinese Academy of Sciences, the Academy of Military Science, the State Information Center, International Knowledge Centre for Engineering Sciences and Technology (IKCEST) under the Auspices of UNESCO, Peking University, Tsinghua University, Renmin University of China, Zhejiang University, China University of Political Science and Law, University of Wisconsin, China Institute of Communications, China Academy of Information and Communications Technology, and China National Institute of Standardization submitted written review opinions, and more than 10 special review meetings were held. Global Tone Communication Technology Co., Ltd., the only language service provider in China that is rated 5A for its translation service, was responsible for the translation and revision of the series. The company has made great efforts to deliver faithful, reliable and accurate translations. 
Translations are reviewed strictly to ensure faithfulness to the original texts, accuracy, correctness, and readability. Work has also been done to ensure that entries are ordered and indexed in accordance with clearly defined rules, the terms with Chinese characteristics and the newly created scientific and technical terms are carefully translated on the basis of adequate research, the difficult terms are translated with utmost accuracy, and the culturally sensitive terms are handled correctly. Big Data Terminology is published by Science Press, which in itself is a manifestation of authority. China Science Publishing & Media Ltd. (CSPM) is China’s largest science publishing company and one of the country’s three largest publishing and media groups. As a publishing organization of the Chinese Academy of Sciences, its professionalism, strict quality control, and rigorous publishing process are well-known in the publishing industry. During the publication process, Big Data Terminology was designated as a key national publication project during the 14th Five-Year Plan period (2021-2025). The publishing company has made every effort to ensure professionalism, accuracy and standardization in publication and distribution of the terminology.

Reporter: Big Data Terminology is the world’s first smart multilingual terminology series that deals with different aspects of big data. What are the characteristics and highlights of the terminology?

Lian Yuming: The 20-volume Big Data Terminology series marks another innovation of the National Big Data (Guizhou) Comprehensive Pilot Zone after the original eleven-language edition. It is an important symbol of the growing global influence of China’s innovations in big data theory and standardization. Big Data Terminology has four main characteristics: it is presented as an encyclopedia; it is reviewed by authoritative experts; it covers many languages; and it delivers a smart experience.

Like an encyclopedia, Big Data Terminology deals with different aspects of the subject of big data. It presents information that is considered forward-looking in the global context and organizes such information into nine categories: basics of big data, big data strategy, big data technology, big data economy, big data finance, big data governance, big data standards, big data security, and big data law. Together, they comprise a multilingual academic discourse system that is unified, standardized, and conforming to international principles.

China National Committee for Terminology in Science and Technology has organized a team of top experts from the Chinese Academy of Sciences and other related institutes across the world. The experts reviewed the entries in their respective fields according to the Committee’s Principles and Methods for the Review of Terms in Science and Technology.

As one of its innovative features, this series includes 20 bilingual volumes, each having its content presented in Chinese and another language (Arabic, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Cambodian, Hebrew, Indonesian, Malay, Mongolian, Persian, Serbian, Thai, Turkish and Urdu). It aims to provide knowledge in a convenient, accurate, and smart way, and to further foster the dissemination of the standard big data terminology system among people from different linguistic and geographical backgrounds.

The series includes innovative features such as knowledge graphs, audio books, and platform links, thus allowing readers to have a smart, platform-based and systematic experience. On the basis of continuous research, the Global Sharing Platform for Big Data Language Services was developed. All the multilingual resources and services on this platform are provided for free to people around the world via Big Data Terminology Cloud, Silk Road Big Data Terminology, Big Data Terminology Online, Big Data Terminology Guiyang Index, and Big Data Terminology Library.

Reporter: What will the project team do after the publication of Big Data Terminology?

Lian Yuming: The publication of Big Data Terminology is a milestone for the big data terminology project. On this basis, we are now carrying out follow-up actions in three areas to promote international dissemination and application of the project outcomes and drive the development of global digital civilization.

First, we are developing standards with Big Data Terminology at the core to increase our influence in big data standardization. We will continue to deepen the research on the basis of the eleven-language edition and the 20-volume edition of Big Data Terminology to support the formulation of the Handbook on Standard Big Data Terminology in the Official United Nations Languages, promote communication across languages in terms of big data technology and knowledge, and facilitate connectivity between the world’s major countries. We will also continue to promote international dissemination and application of big data terminology in a wider range and broader fields, and tell stories about China well.

Second, we are building an index system on the basis of the Big Data Terminology Guiyang Index, seeking to present the grand picture of big data development. We will, via the Big Data Blue Paper and based on continuous research efforts, continue to update and release information on the Big Data Terminology Guiyang Index which consists of six sub-indexes: the global digital competitiveness index, big data development index, big data security index, law-based big data governance index, big data financial risk control index, and governance technology index. We will use the indexes to comprehensively measure and evaluate China’s performance in big data development, and support China’s effort to increase competitiveness and participation in world governance.

Third, we are building a system of platforms around the Global Sharing Platform for Big Data Language Services to increase our international influence in the field of big data. We will continue to improve Big Data Terminology Cloud, Silk Road Big Data Terminology, and Big Data Terminology Online which have the big data terminology, audio database, knowledge base, and corpus at the core. We will also build the Big Data Terminology Guiyang Index platform which is supported by the index-related data, and Big Data Terminology Library supported by the knowledge base. Our goal is to build the Global Sharing Platform for Big Data Language Services into a dynamic open-source database and an open platform covering widely spoken languages in the world, to provide knowledge in a convenient, accurate, and smart way, and to promote dissemination and application of “Chinese knowledge” on a global scale.

Reporter: Big Data Terminology has facilitated international communication between industries and sectors. What positive impact will this have on the Data Valley Guiyang, on China and on the world?

Lian Yuming: As far as Guiyang is concerned, it will further promote the visibility and international influence of China Data Valley. In the era of globalization, terminology is playing an important role in scientific and technological competition and the contest for influence. Guiyang is the core area of China’s first big data comprehensive pilot zone and, in conjunction with China National Committee for Terminology in Science and Technology, is the first to carry out innovative research in, and intelligent promotion of, terminology standardization, which plays a unique and irreplaceable role in enabling scientific and technological development and rulemaking. In particular, Big Data Terminology is expected to greatly enhance China’s international influence in the field of big data and relevant rulemaking. Meanwhile, the publication of Big Data Terminology will also support higher-level opening up of Guizhou and Guiyang and help Guiyang become a stronger provincial capital.

From a national perspective, it will help China increase its international influence in the era of digital civilization. Language holds the key to communication and to the future, and breaking down language barriers is crucial for promoting the dissemination of “Chinese knowledge” and increasing international influence in the era of big data. Through Big Data Terminology, we have established a standardized system of concepts that covers many languages with the Chinese terminology as the basis. It aims to promote the standardization of big data terms worldwide, to provide knowledge in a convenient, accurate, and smart way, and to further foster the dissemination of the standard big data terminology system among people from different linguistic and geographical backgrounds. It is an important symbol of the growing international influence of China’s innovations in big data theory and standardization. According to the National Committee for Terminology in Science and Technology, the publication of Big Data Terminology is expected to greatly enhance China’s international influence in the field of big data and relevant rulemaking, contributing significantly to the high-quality development of big data in China and around the world, as well as promoting the international dissemination of big data knowledge and exchange and cooperation in relevant fields.

From a global perspective, it will play a positive role in promoting the Belt and Road Initiative (BRI) and building a community with a shared future for mankind. On the foundation of the eleven-language edition of Big Data Terminology, which is also the first of its kind, this series now covers more languages, systematically presenting its content in Chinese and 20 other languages: Arabic, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Cambodian, Hebrew, Indonesian, Malay, Mongolian, Persian, Serbian, Thai, Turkish, and Urdu. These languages cover the Sino-Tibetan, Indo-European, Altaic, Austroasiatic, Afro-Asiatic, and Austronesian language families and can reach out to readers on six continents around the world, promote international dissemination and application of big data, especially in countries along the Belt and Road, and drive the development of the digital economy. The Alliance of International Science Organizations (ANSO) believes that the series provides an authoritative template for people in the BRI countries and regions, as well as the world, to better understand a digital China and that it embodies a care for the shared future of mankind.

Disclaimer: The views, suggestions, and opinions expressed here are the sole responsibility of the experts. No Boston New Times journalist was involved in the writing and production of this article.