Capstone Project: OCR Verification of Electrical Schematics in Complex Environments


Note

Because this project involves proprietary company materials and trade secrets, the code and research report cannot be shared publicly. This page instead provides an overview of my work.

To maintain confidentiality, all images used in this summary are augmented versions of publicly available schematics found online (see links below). Some processes have been simplified and numerical values rounded for security purposes.

  • Schematic Base Image: Here
  • Linkage Base Image: Here

Motivation

PAR Systems is an automation integrator specializing in custom equipment for industries such as medical, aerospace, and nuclear, and accurate electrical schematics play a crucial role in its work. Because the machines are custom-designed from the ground up, these documents require thorough review. Catching schematic errors before a build is essential for avoiding unnecessary costs.

One of the most time-consuming parts of verifying these schematics is manually checking cable linkages, a process that can take dozens of hours on larger projects. The task is tedious and error-prone, and missed errors can lead to significant and costly consequences.


Problem Definition

The primary objective of this project was to develop a software solution capable of accurately verifying the existence and integrity of all cable linkages within electrical schematics. The program aimed to achieve an accuracy rate at least as high as human verification (estimated at 80%) while minimizing false positive verifications. Additionally, the program had to run completely locally to satisfy security requirements.


Proposed Approach

The image below illustrates the proposed solution's architecture, showing how the components work in series to verify cable linkages.

Pipeline Image

This approach offers several advantages:

  • Efficiency: By utilizing lightweight object detection instead of computationally intensive convolution searches, we can significantly reduce the verification time.
  • Modularity: The framework is designed to be modular, allowing individual components to be updated or replaced without disrupting the entire pipeline. This flexibility enables seamless integration with future improvements and enhancements.

System Process

1. PDF Ingestion and Image Conversion

To simplify analysis, the electrical schematics were ingested as PDFs and each page was converted to a high-resolution (300 dpi) image. Working from rasterized pages gives every later stage a uniform input, and the high resolution limits the loss of fine detail. The sample schematic below illustrates the type of input the system can handle.
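As a rough illustration of what 300 dpi means for image size, the sketch below converts a PDF page size in points (1/72 inch) to pixel dimensions. The function name `page_pixels` and the 11 x 17 inch sheet are assumptions made for the example, not details from the project.

```python
def page_pixels(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Return the (width, height) in pixels of a PDF page rendered at `dpi`.

    PDF page sizes are expressed in points, where 72 points equal one inch.
    """
    scale = dpi / 72.0
    return round(width_pt * scale), round(height_pt * scale)


# Hypothetical example: an 11 x 17 inch sheet is 792 x 1224 points.
print(page_pixels(792, 1224))  # (3300, 5100)
```

Images of this size are large, which is one reason a lightweight detector is attractive for the later search step.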

Sample Input Schematic

2. Column and Row Number Extraction

The column and row number extraction step uses image processing to locate the column and row numbers on each schematic page. First, unnecessary padding is removed by cropping each image to its border. Next, Optical Character Recognition (OCR) is applied within a predefined region of each page to identify the column number. Because all drawings use the same rigid template, convolutional methods were used to locate two key row numbers. The locations of the remaining row numbers were then extrapolated from these two anchor points.
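The extrapolation from two anchor rows can be sketched as simple linear interpolation, assuming the template spaces its rows evenly. The function name, the anchor format `(row_number, y_pixel)`, and the sample values are all hypothetical.

```python
def extrapolate_rows(
    anchor_a: tuple[int, float],
    anchor_b: tuple[int, float],
    first_row: int,
    last_row: int,
) -> dict[int, float]:
    """Estimate the y-pixel position of every row from two located anchor rows.

    Each anchor is (row_number, y_pixel). Rows are assumed to be evenly
    spaced, which is an assumption about the drawing template.
    """
    (row_a, y_a), (row_b, y_b) = anchor_a, anchor_b
    if row_a == row_b:
        raise ValueError("anchors must be two different rows")
    pitch = (y_b - y_a) / (row_b - row_a)  # pixels per row
    return {row: y_a + (row - row_a) * pitch for row in range(first_row, last_row + 1)}


# Hypothetical anchors: row 2 found at y=400 px, row 8 found at y=1600 px.
rows = extrapolate_rows((2, 400.0), (8, 1600.0), first_row=0, last_row=10)
print(rows[0], rows[5], rows[10])  # 0.0 1000.0 2000.0
```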

3. YOLO Search: Cable Linkage Detection

A YOLO model was fine-tuned to identify cable linkages, the critical callouts that connect wires across different pages of a schematic set. The example image below shows one such linkage. Detecting these callouts lets the system trace how wires continue from one page to another. Of the approaches tested, the YOLO search proved to be the best method for determining whether a cable linkage was present.

Example Cable Linkage

To address the lack of pre-existing datasets for this task, a custom dataset was created specifically for this project. Comprising approximately 3,300 training samples and 400 test samples, it provided an adequate foundation for model development. The YOLOv11n base model was fine-tuned over five epochs, resulting in the best model tested.

The initial confidence threshold of 85% found only about 50% of cable linkages. Because later post-processing steps screen out bad detections, the threshold could be lowered to 30% without compromising overall system reliability. At this threshold the YOLO search found more than 90% of linkages, at the cost of some false positives.
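The effect of relaxing the threshold can be sketched with a plain filter over detections. The `Detection` class, the `filter_detections` helper, and the boxes and scores below are made up for illustration; they do not come from the project or from any YOLO library API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    box: tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
    confidence: float               # detector score in [0, 1]


def filter_detections(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Hypothetical detector output for one page.
detections = [
    Detection((10, 10, 60, 30), 0.92),
    Detection((200, 40, 250, 60), 0.55),
    Detection((400, 90, 450, 110), 0.31),
    Detection((500, 300, 520, 310), 0.12),
]

strict = filter_detections(detections, 0.85)   # original threshold
relaxed = filter_detections(detections, 0.30)  # relaxed threshold
print(len(strict), len(relaxed))  # 1 3
```

The relaxed filter passes more candidates downstream, so it only makes sense when later stages can reject the false positives.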

Example Cable Linkage

4. OCR Linkage Text Extraction and Validation

To extract numerical data from cable linkages, two Optical Character Recognition (OCR) engines were used: TesserOCR and EasyOCR. Because the project had to run locally, cloud services were not used. To improve accuracy, only digits were whitelisted for extraction.

The algorithm ingested images of cable linkages and returned a list of extracted numbers from each model. The two lists were then compared to find the numbers they had in common. Numbers that appeared in both outputs and were 3 to 5 digits long were added to a “significant digits” list, and each linkage was assigned its own confidence score based on the similarity of the two OCR engine results.
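A minimal sketch of this comparison is shown below. It takes the two engines' outputs as plain lists of strings, so no OCR library is needed to run it. The similarity metric is an assumption: the source does not say how similarity was measured, so this example uses the standard library's `difflib.SequenceMatcher` ratio. The function name `consensus_numbers` and the sample outputs are hypothetical.

```python
from difflib import SequenceMatcher


def consensus_numbers(
    tesser_out: list[str],
    easy_out: list[str],
    min_len: int = 3,
    max_len: int = 5,
) -> tuple[list[str], float]:
    """Return (numbers both engines agree on, similarity score in [0, 1])."""

    def is_valid(token: str) -> bool:
        # Keep only ASCII digit strings of the accepted length.
        return token.isascii() and token.isdigit() and min_len <= len(token) <= max_len

    easy_set = set(easy_out)
    # dict.fromkeys removes duplicates while preserving order.
    agreed = list(dict.fromkeys(t for t in tesser_out if is_valid(t) and t in easy_set))

    if not tesser_out and not easy_out:
        return agreed, 0.0  # nothing was read, so there is no evidence of agreement
    # Assumed metric: character-level similarity of the two joined outputs.
    score = SequenceMatcher(None, " ".join(tesser_out), " ".join(easy_out)).ratio()
    return agreed, score


# Hypothetical outputs for one linkage image; the engines disagree on "37" vs "87".
numbers, score = consensus_numbers(["1204", "37", "5510"], ["1204", "87", "5510"])
print(numbers)  # ['1204', '5510']
```

Requiring agreement between two independent engines trades recall for precision, which matches the stated goal of minimizing false positive verifications.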

5. Verification and Data Export

A verification algorithm was run to confirm that each linkage had a valid source-to-destination connection with a return link back to its origin. The raw text data was exported in both CSV format for review and PDF format for visual interpretation. A color-coded indicator system was implemented, where the interior of bounding boxes surrounding verified linkages turned green, indicating confidence in their existence, while unverified connections remained red upon export.
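The return-link check can be sketched as follows, assuming each linkage has been reduced to a `(source, destination)` pair of reference strings. That representation, the function name, and the sample references are assumptions for illustration.

```python
def verify_linkages(linkages: list[tuple[str, str]]) -> dict[tuple[str, str], bool]:
    """Mark each (source, destination) linkage as verified if its reverse also exists."""
    pairs = set(linkages)
    return {
        (src, dst): src != dst and (dst, src) in pairs  # a self-link never verifies
        for src, dst in linkages
    }


# Hypothetical linkages: the first two point at each other, the third has no return link.
links = [("1204", "3307"), ("3307", "1204"), ("1510", "4402")]
status = verify_linkages(links)
for link, ok in status.items():
    print(link, "green" if ok else "red")
```

Under this sketch, a single misread digit on either end breaks the pair and leaves both linkages unverified.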

Example Cable Linkage

Main Findings and Experimental Results

  • Strengths:
    • Exceptional recognition of linkages by YOLO model, demonstrating its potential for accurate identification.
    • Image preprocessing achieved remarkable success rates, laying a solid foundation for further development.
  • Challenges and Areas for Improvement:
    • OCR accuracy proved to be the most significant hurdle in the project. Extracting text from the small image crops identified by the YOLO search was particularly challenging, and nearly all errors were traced to incorrect text extraction by the OCR engines.
    • Adapting to deviations from company standards presented another challenge that required careful consideration.

Despite these challenges, the project produced a program that identifies and verifies cable linkages while running entirely locally. Although its accuracy fell short of the estimated human benchmark (~80%), the project was an important step towards a more efficient system for linkage verification.


Potential Impact

This project marks a significant milestone towards automating internal review processes at PAR, possibly saving hundreds of hours annually and empowering employees with cutting-edge technologies that can drive further process improvements across the company.


Limitations and Future Work

  • OCR Reliability: Although OCR models show potential, the investigation revealed that they cannot yet be relied on for industrial-grade accuracy.
  • Linkage Detection: The YOLO model’s performance fell short of expectations when it came to detecting all instances of linkages within documents.
  • Information Parsing: Efforts were hindered by the proprietary nature of the drawings and a need to utilize free, open-source technologies. This limited the ability to develop more sophisticated information parsing techniques.

Future developments may involve integrating this solution with PAR’s current template using enhanced YOLO models and fine-tuned OCR engines specifically designed for this task. Alternatively, variations of this project could be explored to analyze underlying text data in new PDF documents, rather than converting them into images.


Collaboration

This capstone project was undertaken in collaboration with PAR Systems and the University of Minnesota. I would like to extend sincere gratitude to Sam Johnson from PAR Systems for his invaluable support and guidance, as well as my advisor Vasillios Morellas at the University of Minnesota for his expert mentorship throughout this endeavor.

Alexander Besch
Graduate Student & Engineering Intern

My research interests include artificially intelligent agents, intelligent robotics, and manufacturing.