Automatic document digitalization for long-term archiving with iSoma

The sooner long-term data archiving is implemented, the sooner businesses can avoid the risk of losing their valuable assets and put those assets to use in creating new value and advantages. To help organizations and businesses optimize data digitalization with smart and comprehensive processes, FPT has launched the iSOMA data digitalization solution, along with a suite of technology platforms for data archiving, management, and utilization.

1. The urgency of data digitalization

Today, data is considered a valuable asset by every organization and business, and all of them constantly seek solutions for long-term archiving and efficient utilization. However, for various reasons, archiving is still done manually and carries risks: documents can degrade or be damaged because of their material or the storage environment, leading to data loss with no possibility of recovery.

Different industries have unique data archiving and utilization requirements, necessitating different methods for these processes. Here are a few examples:

  • Chu-Nom, the first writing system developed by the Vietnamese people to distinguish their script from Han (Chinese) characters, was used continuously for nearly a thousand years, from the 10th century to the 19th century. Currently, only about 100 people worldwide can read and write Nom fluently, and over 90% of Nom bibliographies have not yet been translated into Quoc-Ngu (modern Vietnamese script). This situation highlights the urgent need for a system that helps users search, input, translate, and store Nom documents quickly, easily, and accurately, especially since a large portion of Nom documents exist in various forms such as ancient books, horizontal lacquered boards, parallel sentences, stelae, and bells.
  • Historical documents: Vietnam has a rich history of resistance wars, which produced countless articles of significant historical value. However, these documents have deteriorated over time, making it essential to digitalize and archive them for the long term.
  • Central and local authority documents: Despite many years of extensive digital transformation, from the central to local levels, including notable achievements like e-government with an electronic one-stop shop, there remains data from past decades that has not yet been digitalized and archived on digital platforms. This issue presents challenges in searching and processing records and data. To address such risks, the Vietnamese government has issued the Law on Archives No. 01/2011/QH13 and various circulars and decrees to guide its implementation across ministries and departments at all levels.
  • Business and organizational data: In today’s rapidly evolving technological landscape, businesses and organizations must focus on sustainable development and market breakthroughs, and improvement and innovation are key to achieving these goals. However, creating something new is challenging in an era where, with extensive globalization, human knowledge can appear to have reached its peak. For that reason, reusing historical data to drive improvements and generate new ideas, while considering what is likely to happen in the future, can help businesses and organizations keep up with trends and create market breakthroughs. Once basic data (hard copies) is digitalized and archived along with its metadata, businesses can leverage technologies such as AI, machine learning, and NLP to build effective data utilization models. This not only boosts labor productivity but also facilitates the creation of entirely new business models, ideas, and even products.


Traditional archiving not only takes up space and incurs costs but also makes it difficult to preserve and retrieve data in the long term.

2. Digitalization and Basic Digitalization Process

Digitalization is the process of converting physical and analog information into digital form. The information is uploaded to a computer system and processed by software, making it easy to store and search.

While digitalization may not always coincide with digital transformation of a business/organization, the former serves as a crucial input for the latter.

To perform digitalization, the following basic steps are typically involved:

  • Step 1: Document collection

This is a crucial step in the digitalization process. It involves gathering all relevant material, including hard copies that have not yet been scanned and soft copies that have already been scanned (such as PDFs and images). The sources can range from text documents like articles and books to inscriptions on stone stelae.

  • Step 2: Document classification

Documents are meticulously and scientifically classified to ensure accurate assessment of their current state. Suitable scanning methods are then selected for each type of document or material.

  • Step 3: Document scanning

Based on the document classification results, appropriate scanning devices are used for different types of documents. These can include document scanners for various paper sizes (A0, A2, A3, A4), 3D object scanners, and specialized cameras.

  • Step 4: Document checking

After scanning, the digital copy saved on the computer is compared with the original hard copy to verify its accuracy and completeness. This step confirms whether the scan is satisfactory or needs to be redone.

  • Step 5: Data input, labeling and indexing

Once the scan is confirmed, digitalization personnel will perform labeling and indexing steps according to the organization’s archiving needs.

Note that the input can involve entering essential data fields or re-entering all relevant information.

  • Step 6: Input checking

This step aims to ensure that all data has been entered, labeled, and indexed correctly. During this step, experienced personnel are assigned to perform an acceptance assessment of the input.

  • Step 7: Data export and archiving

At this step, metadata, two-layer PDFs, or other required data formats are exported for archiving and utilization by the organization.

  • Step 8: Data searching and retrieving

Different solutions are designed or existing solutions are applied to meet specific search and retrieval needs of each business or organization. The searching process also becomes easier with current AI technologies, such as GPT-powered chatbots.
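The eight steps above can be read as a pipeline with two quality gates: a failed scan check (step 4) sends the document back to scanning, and a failed input check (step 6) sends it back to data entry. The Python sketch below is a minimal illustration under those assumptions. The `Document` class, the callback names, and the retry limit are hypothetical and are not taken from iSOMA or any FPT product.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Hypothetical record that tracks one document through the workflow."""
    doc_id: str
    doc_type: str = "unclassified"
    scan_attempts: int = 0
    input_attempts: int = 0
    metadata: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


def run_pipeline(
    doc: Document,
    classify: Callable[[Document], str],
    scan_ok: Callable[[Document], bool],
    enter_data: Callable[[Document], dict],
    input_ok: Callable[[Document], bool],
    max_attempts: int = 3,  # assumed limit, not stated in the article
) -> Document:
    """Walk one document through steps 1 to 7; step 8 (search) runs later on the archive."""
    doc.history.append("collected")        # Step 1: document collection
    doc.doc_type = classify(doc)           # Step 2: document classification
    doc.history.append("classified")

    # Steps 3 and 4: scan, then check the scan; rescan when the check fails.
    while True:
        doc.scan_attempts += 1
        doc.history.append("scanned")
        if scan_ok(doc):
            doc.history.append("scan_checked")
            break
        if doc.scan_attempts >= max_attempts:
            raise RuntimeError(f"{doc.doc_id}: scan rejected {max_attempts} times")

    # Steps 5 and 6: enter, label, and index data; redo when the check fails.
    while True:
        doc.input_attempts += 1
        doc.metadata = enter_data(doc)
        doc.history.append("data_entered")
        if input_ok(doc):
            doc.history.append("input_checked")
            break
        if doc.input_attempts >= max_attempts:
            raise RuntimeError(f"{doc.doc_id}: input rejected {max_attempts} times")

    doc.history.append("exported")         # Step 7: data export and archiving
    return doc
```

Passing the checks in as callbacks keeps the sketch independent of how a check is actually performed, whether by a person comparing the scan with the original or by software.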


Figure: Digitalization steps (document collection of images, PDFs, etc.; document classification; document scanning; document checking; OCR and data input; input checking; data export and archiving; data searching and retrieval).

3. Automatic digitalization with iSOMA

Digitalizing millions or hundreds of millions of data copies, if done manually following the above steps, will lead to increased costs, posing a significant barrier to the digitalization and digital transformation of businesses and organizations.

FPT’s iSOMA data digitalization solution is therefore the key to addressing the most fundamental problems in digitalization.

iSOMA leverages technologies from FPT Corporation to enhance digitalization performance, save costs, and increase accuracy.

  • Firstly, iSOMA is fully featured: it manages and integrates with scanning devices and transfers scans to a web platform, where digitalization personnel can easily access, organize, edit, and verify records.
  • Secondly, the solution includes automatic form recognition, which supports the automation of document arrangement and form classification.
  • In particular, iSOMA applies advanced AI-OCR technology to recognize numbers, letters, and even handwriting with high accuracy. This extends automation to the work that follows scanning and multiplies digitalization productivity.
  • Finally, iSOMA supports cross-platform integration, so that data can be stored in and connected to other information archiving and retrieval systems through highly secure and customizable APIs.
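One way to picture how AI-OCR can reduce manual work after scanning: fields the engine recognizes with high confidence are accepted automatically, and only uncertain ones are routed to digitalization personnel for review. The Python sketch below is illustrative only. The `OcrField` structure, the confidence scores, the sample values, and the 0.90 threshold are all assumptions and do not describe iSOMA's actual API or behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrField:
    """Hypothetical OCR result for one field of a scanned form."""
    name: str
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


def route_fields(fields, threshold=0.90):
    """Split OCR results into auto-accepted values and fields needing human review."""
    accepted = {}
    needs_review = []
    for f in fields:
        if f.confidence >= threshold:
            accepted[f.name] = f.text
        else:
            needs_review.append(f.name)
    return accepted, needs_review


# Made-up sample values for illustration.
fields = [
    OcrField("document_number", "123/ABC", 0.98),
    OcrField("handwritten_note", "illegible text", 0.62),
]
accepted, needs_review = route_fields(fields)
# accepted is {"document_number": "123/ABC"}; needs_review is ["handwritten_note"]
```

Under this assumption, the threshold trades manual effort against accuracy: a higher value sends more fields to personnel, while a lower value accepts more OCR output unchecked.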


Digitalization power of iSOMA & FPT digital transformation ecosystem

Long-term archiving of data resources is an urgent need that no organization or business can ignore. The sooner it is implemented, the sooner businesses can avoid the risk of losing their valuable data assets and put those assets to use in creating new value and advantages. Today’s technology makes digitalization faster, more secure, and easier. With iSOMA and other platforms from FPT Corporation in particular, organizations and businesses are supported in establishing a comprehensive and robust digital transformation ecosystem.


Figure: Digitalization using iSOMA is an affordable choice (labels: Human, Device, Technology; Paper documents, images, sound files; iSoma data digitalization solution; Electronic data, Data lake; Data that can be utilized).

Exclusively written by FPT IS Technology Expert

Do Xuan Tien

Product Owner of iSoma – Data Digitalization Solution
