Unsupervised-Based Information Extraction from Unstructured Arabic Legal Documents
Making traditional unstructured or semi-structured legal texts meet the requirements of high-level applications, such as AI applications in the legal domain, requires overcoming the challenge of automatically extracting and analyzing structured information from legal documents. This paper proposes an architecture that combines feature-based, lexical, and rule-based approaches to extract the needed information from traditional legal documents. The research uses a dataset collected from decision documents of the Iraqi Federal Court of Cassation to extract two sets of information. The first is a set of general information, called valuable attribute information, which includes the reference law category, date of decision, name of the court of jurisdiction, document number, and decision type. The second is the document essence, focused legal information that includes the principle, arguments, legal opinions, and facts of the case, which can be used in any subsequent analysis phase. This research is part of a larger project entitled "The Arabic documents opinion extraction using argumentation mining", and the preliminary results were quite promising.
