Moscow, Russian Federation
Russian Federation
The article discusses automating the extraction and analysis of tabular data that describe the characteristics of electronic components. It highlights the problem of extracting such data from PDF documents manually. The work is relevant because designers at a modern enterprise that uses laser technologies for high-accuracy distance measurement need to work with data published in PDF format. Extracting and analyzing these data is difficult, however, because of the way the technical characteristics of electronic components are stored in PDF documents and because effective tools for reading and converting the information are lacking. The paper proposes a solution based on Python scripts that automate the extraction and analysis of tabular data from PDF documents. The scripts extract data from recognized tables and convert them into a format that is convenient for further processing.
PDF, documents, table recognition, script, Python
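The conversion step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' script: it assumes that a table has already been recognized and is available as a list of rows, each row a list of cell strings with None for empty cells (a shape that PDF table-extraction libraries commonly return). The function names and the sample datasheet values are hypothetical and serve only as an illustration.

```python
import csv
import io


def clean_cell(cell):
    """Collapse line breaks and repeated spaces in a cell; treat None as empty."""
    if cell is None:
        return ""
    return " ".join(str(cell).split())


def rows_to_records(rows):
    """Turn a raw table (header row followed by data rows) into a list of dicts."""
    if not rows:
        return []
    header = [clean_cell(c) for c in rows[0]]
    records = []
    for row in rows[1:]:
        cells = [clean_cell(c) for c in row]
        if not any(cells):
            continue  # skip rows that are completely empty
        # Pad short rows so that every record has a value for every column.
        cells += [""] * (len(header) - len(cells))
        records.append(dict(zip(header, cells)))
    return records


def records_to_csv(records):
    """Serialize the records to CSV text for further processing."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical rows, standing in for a table recognized in a component datasheet.
raw_table = [
    ["Parameter", "Min", "Max", "Unit"],
    ["Supply\nvoltage", "2.7", "5.5", "V"],
    [None, None, None, None],
    ["Operating temperature", "-40", "85", "degC"],
    ["Output current", "10", None],
]

records = rows_to_records(raw_table)
csv_text = records_to_csv(records)
print(csv_text)
```

Keeping the cleanup separate from the serialization means the same records could be written to another format without changing the normalization logic.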



