Job
Local Consultant (Expert in Information Collection and Database Production Systems)
UNODC Careers
Vienna, Austria
Other
Application
This listing comes from an external source. Applications are submitted on the employer's website.
Description
Result of Service
All products specified herein must be prepared considering the activities described in the previous item. Deliveries must be validated by the consultancy's supervision and, if necessary, revised versions must be resubmitted.
PRODUCT 1: Technical report containing a survey of the formats and structures of state reports. The document shall consolidate a mapping of the formats (PDF, DOCX, internal systems, etc.), standards, and information fields used in each federal unit (UF) and in the Federal Police. It shall also contain a comparative table between the UFs and between the UFs and the Federal Police, highlighting similarities and challenges.
PRODUCT 2: Technical document containing a map of variables of interest and initial work taxonomy.
PRODUCT 3: Technical document containing a methodological proposal for data extraction and structuring. The document should detail the use of OCR, NLP, and standardisation techniques. If possible, it should compare optical character recognition (OCR) results obtained with traditional approaches and with LLMs. It should also contain an illustrated workflow (data pipeline) and the quality and validation criteria for extractions.
PRODUCT 4: Technical document containing code (in Python or R, for example) for extracting data from a sample set of reports. The code delivered must have already been tested in different file formats. The document must also contain step-by-step instructions for applying the code.
PRODUCT 5: Technical document containing the results of the extraction from real samples of reports from some states and the Federal Police. It should contain an assessment of accuracy, limitations, and necessary adjustments.
PRODUCT 6: Technical document containing code revised after sample pre-testing and scripts and technical documentation for the OCR and NLP modules adapted to forensic reports. The document should include text pre-processing (cleaning, tokenisation, and data normalisation).
PRODUCT 7: Unified database with information extracted from reports with a relational or document-oriented structure, ready for analysis. It should include data from all states and the Federal District.
PRODUCT 8: Technical document containing a dictionary of standardised variables and metadata, with a clear definition of each variable, format, unit of measurement and rules for completion, including details in formats such as JSON Schema (for use in OpenAI and Ollama APIs, for example).
PRODUCT 9: Document containing a technical manual and final report on the methodology for replication, including instructions for using the tools, maintenance, and updating. It should also contain an analysis of challenges and recommendations for future expansion.
Work Location
Expected duration
Duties and Responsibilities
Qualifications/Special Skills
Languages
Additional Information
No Fee
Source: https://at.linkedin.com/jobs/view/local-consultant-expert-in-information-collection-and-database-production-systems-at-unodc-careers-4358460264
Requirements
Level: senior