PANACEA defines its typical use cases and user
WP8 defines the typical user who will carry out typical use cases on the PANACEA web service platform
The specific use cases of the factory that PANACEA will offer are mainly directed at NLP-based application developers. WP8 defined and explained specific tasks, and the activities they comprise, in order to fix criteria for the final evaluation.
Typical use cases and operations that PANACEA web services will cover include the following:
Corpus tasks
- Build a corpus by web crawling
- Process a corpus with different services: sentence-segment it, tokenize / lemmatize / tag it
- Align two parallel texts: on document level, on paragraph level, on sentence level
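As an illustration of the corpus processing task above, the following is a minimal sketch of sentence segmentation followed by tokenization. It does not call any PANACEA web service; the function names (`split_sentences`, `tokenize`, `process_corpus`) and the regular expressions are assumptions made for this example, and a real service would use far more robust language-specific rules.

```python
import re

def split_sentences(text):
    """Naive sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence):
    """Naive tokenization: words (allowing internal apostrophes or hyphens)
    or single punctuation marks."""
    return re.findall(r"\w+(?:['-]\w+)*|[^\w\s]", sentence)

def process_corpus(text):
    """Chain the two steps: one token list per sentence."""
    return [tokenize(s) for s in split_sentences(text)]

corpus = "PANACEA builds corpora. It also aligns parallel texts!"
processed = process_corpus(corpus)
for sentence_tokens in processed:
    print(sentence_tokens)
```

Chaining small functions in this way mirrors the idea of combining separate services into one workflow; lemmatization and tagging would be further steps applied to each token list.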
Dictionary tasks
- Input a corpus for dictionary extraction (general purpose or domain specific)
- Submit a corpus for dictionary gap identification
- Acquire corpora for new / unknown words
- Enlarge a dictionary by merging corpus-extracted information on entry level, on transfer level and on annotation level (additional translations)
- Trace word occurrences over time (‘word of the day’)
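Dictionary gap identification, as listed above, can be pictured as comparing the words of a corpus against the entries of a dictionary. The sketch below is only an illustration under simple assumptions: the dictionary is a plain set of word forms, matching is case-insensitive, and the function name `find_dictionary_gaps` and its `min_count` parameter are hypothetical, not part of any PANACEA service.

```python
from collections import Counter

def find_dictionary_gaps(tokens, dictionary, min_count=1):
    """Return corpus words missing from the dictionary, with their frequencies.

    Assumption: tokens are already tokenized, and only alphabetic tokens count as words.
    """
    counts = Counter(t.lower() for t in tokens if t.isalpha())
    known = {w.lower() for w in dictionary}
    return {w: c for w, c in counts.items() if w not in known and c >= min_count}

tokens = ["The", "crawler", "fetched", "the", "corpus", ",", "the", "crawler", "stopped"]
dictionary = {"the", "corpus", "fetched", "stopped"}
gaps = find_dictionary_gaps(tokens, dictionary)
print(gaps)
```

Raising `min_count` filters out rare forms such as typos, so that only recurring unknown words are proposed as candidate entries.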
Extraction tasks
- Send a corpus to extract information items (named entities, or just key terms)
- Build an “Alerting System” (do texts match the alerting profile?) by inserting a dictionary gap detection service into the workflow
- Construct a workflow for “Topic Assignment” by using services for keyword extraction and training a classifier with pre-annotated data
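The question behind the alerting use case (do texts match the alerting profile?) can be sketched very simply. The example below assumes that an alerting profile is just a set of key terms and that a text triggers an alert when it contains at least `min_hits` of them; this representation, the function name `matches_profile` and the threshold are assumptions for illustration only, since the profile format is not specified here.

```python
import re

def matches_profile(text, profile_terms, min_hits=2):
    """Return (alert, hits): whether the text matches the profile, and which terms matched.

    Assumption: a profile is a set of single-word key terms, matched case-insensitively.
    """
    tokens = set(re.findall(r"\w+", text.lower()))
    hits = tokens & {t.lower() for t in profile_terms}
    return len(hits) >= min_hits, sorted(hits)

profile = {"earthquake", "tsunami", "evacuation"}
alert, hits = matches_profile("Tsunami warning issued after the earthquake.", profile)
print(alert, hits)
```

In a fuller workflow, the key terms of each incoming text could come from a keyword extraction service rather than from raw tokens.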
Translation tasks
- Use a crawling system to collect / add corpus data for SMT creation
- Send a corpus to create a Language Model for a specific language and / or a specific domain
- Send a parallel or aligned corpus to create your Translation Model (new language direction, new specific domain)
- Create / adapt an RMT dictionary [with translations, with linguistic annotations (monolingual, transfer)]
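To give an idea of what creating a Language Model from a corpus involves, here is a toy sketch that estimates bigram probabilities by simple counting. It is not the method used by any PANACEA service: the function name `train_bigram_model`, the `<s>` / `</s>` boundary markers and the unsmoothed maximum-likelihood estimate are assumptions chosen to keep the example short.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Estimate P(word | previous word) from tokenized sentences by relative frequency."""
    context_counts = Counter()
    bigram_counts = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]  # mark sentence boundaries
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[prev][cur] += 1
    return {
        prev: {cur: count / context_counts[prev] for cur, count in following.items()}
        for prev, following in bigram_counts.items()
    }

sentences = [["the", "corpus", "grows"], ["the", "corpus", "shrinks"]]
model = train_bigram_model(sentences)
print(model["corpus"])
```

Training the same function on a domain-specific corpus would yield a domain-specific model, which is the point of sending different corpora for different languages or domains.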
This information is also available on PANACEA's blog, which also has links to upcoming conferences, progress reports, and the thoughts, opinions and commentaries of project members.