Automatic document digitalization for long-term archiving with iSoma
The sooner long-term data archiving is implemented, the better businesses can avoid the risk of losing valuable assets, and the sooner they can use those assets to create new value and advantages. To help organizations and businesses optimize data digitalization with smart, comprehensive processes, FPT has launched the iSOMA data digitalization solution, along with a suite of technology platforms for data archiving, management, and utilization.
1. The urgency of data digitalization
Today, data is considered a valuable asset for every organization and business, all of which constantly seek solutions for long-term data archiving and efficient utilization. However, for various reasons, archiving is still done manually and carries risks: documents can degrade or be damaged because of their material or the storage environment, leading to data loss with no possibility of recovery.
Different industries have unique data archiving and utilization requirements, necessitating different methods for these processes. Here are a few examples:
- Chu-Nom: The first writing system created by the Vietnamese people to distinguish their script from Han (Chinese) characters, Chu-Nom was used continuously for nearly a thousand years, from the 10th century to the 19th century. Currently, only about 100 people worldwide can read and write Nom fluently, and over 90% of Nom bibliographies have not yet been translated into Quoc-Ngu (modern Vietnamese script). This highlights the urgent need for a system that helps users search, input, translate, and store Nom documents quickly, easily, and accurately, especially since a large portion of Nom documents exist in varied forms such as ancient books, horizontal lacquered boards, parallel sentences, stelae, and bells.
- Historical documents: Vietnam has a rich history of resistance wars, which produced countless articles of significant historical value. However, these documents have deteriorated over time, making it essential to digitalize and archive them for the long term.
- Central and local authority documents: Despite many years of extensive digital transformation, from the central to local levels, including notable achievements like e-government with an electronic one-stop shop, there remains data from past decades that has not yet been digitalized and archived on digital platforms. This issue presents challenges in searching and processing records and data. To address such risks, the Vietnamese government has issued the Law on Archives No. 01/2011/QH13 and various circulars and decrees to guide its implementation across ministries and departments at all levels.
- Business and organizational data: In today’s rapidly evolving technological landscape, businesses and organizations must focus on sustainable development and market breakthroughs. Improvement and innovation are therefore key to achieving these goals. However, creating new things is challenging in an era where human knowledge appears to have reached its peak due to extensive globalization. For that reason, reusing historical data to drive improvements and generate new ideas, while considering what is likely to happen in the future, can help businesses and organizations keep up with trends and create market breakthroughs. Once basic data (hard copies) is digitalized and archived as metadata, businesses can leverage the latest technologies such as AI, machine learning, and NLP to create effective data utilization models. This not only boosts labor productivity but also facilitates the creation of entirely new business models, ideas, and even products.
Traditional archiving not only takes up space and incurs costs, it also makes it difficult to preserve and retrieve data in the long term.
2. Digitalization and the basic digitalization process
Digitalization is the process of converting physical and analog information into digital form. The information is uploaded to a computer system and processed by software, making it easy to store and search.
While digitalization may not always coincide with digital transformation of a business/organization, the former serves as a crucial input for the latter.
To perform digitalization, the following basic steps are typically involved:
- Step 1: Document collection
This is a crucial step in the digitalization process. It involves gathering all relevant data, including hard copies that have not been scanned and soft copies that have already been scanned (such as PDFs and images). These documents can range from text documents like articles and books to inscriptions on stone stelae.
- Step 2: Document classification
Documents are meticulously and scientifically classified to ensure accurate assessment of their current state. Suitable scanning methods are then selected for each type of document or material.
- Step 3: Document scanning
Based on the document classification results, appropriate scanning devices are used for different types of documents. These can include document scanners for various paper sizes (A0, A2, A3, A4), 3D object scanners, and specialized cameras.
- Step 4: Document checking
After scanning, each digital scan saved on an electronic device (computer) is compared with the original hard copy for accuracy and completeness. This step confirms whether the scan is satisfactory or needs to be redone.
- Step 5: Data input, labeling and indexing
Once the scan is confirmed, digitalization personnel will perform labeling and indexing steps according to the organization’s archiving needs.
Note that the input can involve entering essential data fields or re-entering all relevant information.
- Step 6: Input checking
This step aims to ensure that all data has been entered, labeled, and indexed correctly. During this step, experienced personnel are assigned to perform an acceptance assessment of the input.
- Step 7: Data export and archiving
At this step, metadata, two-layer PDFs, or other required data formats are exported for archiving and utilization by the organization.
- Step 8: Data searching and retrieving
Different solutions are designed or existing solutions are applied to meet specific search and retrieval needs of each business or organization. The searching process also becomes easier with current AI technologies, such as GPT-powered chatbots.
Digitalization steps (figure labels: Document collection (images, PDFs, etc.); Document classification; Document scanning; Document checking; OCR & Data input; Input checking; Data export and archiving; Data searching and retrieval)
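The eight steps above can be read as a pipeline with two quality gates (Steps 4 and 6). The following Python sketch is only an illustration under stated assumptions: the `Document` class, its field names, the device mapping, and the stage functions are hypothetical and are not part of iSOMA or any FPT product. Scanning and data entry are represented by placeholder values rather than real devices or OCR.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Document:
    """Hypothetical record that follows one document through the steps."""
    doc_id: str
    kind: str                                    # Step 2 classification, e.g. "paper_a4"
    scan_ok: bool = False                        # result of Step 4 (document checking)
    fields: dict = field(default_factory=dict)   # Step 5 data input
    labels: list = field(default_factory=list)   # Step 5 labeling and indexing
    input_ok: bool = False                       # result of Step 6 (input checking)


# Assumed mapping from document class to scanning device (Step 3).
DEVICE_BY_KIND = {
    "paper_a4": "document scanner",
    "paper_a0": "large-format scanner",
    "stele": "specialized camera",
}


def choose_device(doc: Document) -> str:
    """Step 3: pick a scanning device based on the Step 2 classification."""
    return DEVICE_BY_KIND.get(doc.kind, "specialized camera")


def check_scan(doc: Document, pages_scanned: int, pages_expected: int) -> Document:
    """Step 4: accept the scan only if it is complete; otherwise it must be redone."""
    doc.scan_ok = pages_scanned == pages_expected
    return doc


def enter_data(doc: Document, fields: dict, labels: list) -> Document:
    """Step 5: data input, labeling and indexing (only after an accepted scan)."""
    if not doc.scan_ok:
        raise ValueError(f"{doc.doc_id}: scan not accepted, rescan first")
    doc.fields = dict(fields)
    doc.labels = list(labels)
    return doc


def check_input(doc: Document, required: set) -> Document:
    """Step 6: acceptance check that required fields and labels were entered."""
    doc.input_ok = required.issubset(doc.fields) and bool(doc.labels)
    return doc


def export_metadata(doc: Document) -> str:
    """Step 7: export the record as JSON metadata for archiving."""
    if not doc.input_ok:
        raise ValueError(f"{doc.doc_id}: input not accepted")
    return json.dumps(asdict(doc), ensure_ascii=False, sort_keys=True)


def search(archive: list, label: str) -> list:
    """Step 8: a naive label lookup over exported JSON records."""
    return [rec["doc_id"] for rec in map(json.loads, archive) if label in rec["labels"]]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    doc = Document(doc_id="D-001", kind="stele")
    print(choose_device(doc))
    doc = check_scan(doc, pages_scanned=3, pages_expected=3)
    doc = enter_data(doc, {"title": "Sample", "year": "unknown"}, ["nom", "stele"])
    doc = check_input(doc, required={"title", "year"})
    archive = [export_metadata(doc)]
    print(search(archive, "nom"))
```

Modeling Steps 4 and 6 as explicit gates means a rejected scan or incomplete input cannot reach the export step, which mirrors the "satisfactory or needs to be redone" decision described in the text.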
3. Automatic digitalization with iSOMA
Digitalizing millions or hundreds of millions of data copies, if done manually following the above steps, will lead to increased costs, posing a significant barrier to the digitalization and digital transformation of businesses and organizations.
FPT’s iSOMA data digitalization solution is therefore the key to addressing the most fundamental problems in digitalization.
iSOMA leverages technologies from FPT Corporation to enhance digitalization performance, save costs, and increase accuracy.
- Firstly, iSOMA is designed to manage and integrate scanning devices, transferring data from scans to the web platform so that digitalization personnel can easily access, organize, edit, and verify records.
- Secondly, the solution integrates automatic form recognition, which supports the automation of document arrangement and form classification.
- In particular, iSOMA applies advanced AI-OCR technology to recognize numbers, letters, and even handwriting with high accuracy. This extends automation to the work that follows scanning, boosting digitalization productivity several times over.
- Finally, iSOMA supports cross-platform integration, making it extremely easy to store and connect to other information archiving and retrieval systems through highly secure and customizable APIs.
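One common way OCR-based automation reduces manual effort is to accept high-confidence recognition results automatically and send only uncertain fields to personnel for verification. The Python sketch below illustrates that general idea. It is not iSOMA's implementation or API, which this article does not describe. The `OcrField` type, the `route_fields` function, and the 0.90 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OcrField:
    """Hypothetical OCR result for one form field."""
    name: str
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def route_fields(
    fields: List[OcrField], threshold: float = 0.90
) -> Tuple[List[OcrField], List[OcrField]]:
    """Split OCR results into auto-accepted fields and fields needing review.

    The threshold is an illustrative assumption, not a vendor setting.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


if __name__ == "__main__":
    # Placeholder values, not real OCR output.
    results = [
        OcrField("full_name", "Nguyen Van A", 0.98),
        OcrField("issue_date", "12/03/1987", 0.71),
    ]
    accepted, review = route_fields(results)
    print([f.name for f in accepted], [f.name for f in review])
```

Under this assumed scheme, personnel verify only the fields in the review list, so manual effort scales with how much of the OCR output is uncertain rather than with the total volume of scanned documents.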
Digitalization power of iSOMA & FPT digital transformation ecosystem
Long-term archiving of data resources is an urgent need for every organization and business and cannot be ignored. The sooner it is implemented, the better businesses can avoid the risk of losing their valuable data assets and the sooner they can use them again to create new value and advantages. Today’s technology allows for faster, more secure, and easier digitalization. With iSOMA and other platforms from FPT Corporation in particular, organizations and businesses will be supported in establishing a comprehensive and robust digital transformation ecosystem.
Digitalization using iSOMA is an affordable choice (figure labels: Human, Device, Technology; Electronic data, Data lake; Paper documents, images, sound files; iSoma data digitalization solution; Data that can be utilized)
Exclusively written by FPT IS Technology Expert
Do Xuan Tien, Product Owner of iSoma – Data Digitalization Solution