( 85%) , , . ', . ETL- (ETL - extract, transform, load), , , , , , , () . :
;
;
;
' ;
, ';
( );
;
', ;
' ;
.
: ; - . () , . [1]. (ETL) , . , OLAP (On-Line Analytical Processing), Text Mining Data Mining, . ' . ' ( ), ' (), ([15]: , ...), (, [16]: , '). ' ' , , ' , 15-30%.
ETL . , ' , NLP (Natural Language Processing), . ' , , . - Semio ., , , . , , . г Text Mining , , , [57].
Text Mining , , , , . 㳿 Text Mining , . , . Text Mining , ', . , Text Mining , , , , , , .
(Text Mining) (text data mining), ' . Data Mining, Text Mining , , GTE Labs .-, , , , [57]. , ( , , ' ) , , . , [10].
Data Mining Text Mining . mining ( ) . Text Mining , Data Mining. Text Mining ' (, , , ), -, , , , , , , . Text Mining , , , .
㳿 Text Mining ' , , , . Text Mining .
, Text Mining , ' , [112]. ³ 㳿 Text Mining Data Mining , , , Text Mining , .
, , , , , , , , Web-, 5-8 % [9].
(. 9.19). ' 㳿 , . . , '.
. 9.19. [9]
, :
;
;
;
, (feature extraction);
;
, , (summarization);
(question answering);
(thematic indexing);
(keyword searching);
( ) .
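Two of the tasks listed above, feature extraction and keyword searching, can be illustrated with a minimal sketch. It assumes plain English text, a small hand-picked stopword list, and exact-match lookup through an inverted index. The function names, the stopword list, and the sample documents are hypothetical and are not taken from any of the products discussed in this chapter.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = frozenset({"the", "a", "of", "and", "is", "in", "to", "from", "with"})


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_features(text):
    """Feature extraction: count occurrences of each non-stopword term."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)


def build_index(docs):
    """Build an inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in set(tokenize(text)):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Keyword searching: ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection.
docs = {
    "d1": "Text mining extracts knowledge from text documents.",
    "d2": "Data mining works with structured data.",
    "d3": "Clustering groups similar text documents.",
}
index = build_index(docs)
```

With this collection, `search(index, "text documents")` matches `d1` and `d3`, and `extract_features(docs["d1"])` counts the term `text` twice. Real systems typically add stemming, weighting such as TF-IDF, and ranking; the sketch omits these to keep the two tasks visible.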
³ Text Mining , () (summarization), , (feature extraction), (clustering), (classification), (question answering), (thematic indexing) (keyword searching). Text Mining 쳿[17] (taxonomies) (thesauri). , 㳺 Text Mining [57]:
;
쳿 , ;
', ( ) ;
, , .
Text Mining , . , , intranet- Web-, , , . , ' . , ' ; .
, ' , , , . , ' ' . , () '. - . , . IBM http://www.software.ibm.com/data/iminer/fortext. . . , , .
, ' , , ', '. ' ', ', . , , . , .
, ' (, ) . ֳ , . ³ , , '; ' .
(. 9.19) - , , , , . ³ . ³ , . , , . ³ () , , .
, Text Mining. , , , . , . , (, WordStat), Aerotext Business Objects Text Analysis. () ClearForest, IQMen, Smartware, --, , , , Convera RetrievalWare, Hummingbird KM, IBM Text Miner, Insight Smart Discovery Extraction Server, Ontos Miner, Oracle Text, ODB-Text, TextAnalyst, InfoStream, XANALYS Link Explorer, , , X-Files , [9].
, 볺-, SemioMap 2.0 Entrieva (1998 .). SemioMap [122] , . SemioMap , , ' [10]. Autonomy Knowledge Server[113] , , .
Oracle 7.3.3. Oracle. Oracle9i Oracle Text [126], , [57]. Oracle Text , : ; ; ; ; '.
IBM Intelligent Miner for Text [120] , : Language Identification Tool , ; Categorisation Tool ( ; Clusterisation Tool ( , , ); Feature Extraction Tool ( ( , , ) ; Annotation Tool .
Galaktika-ZOOM [18] 㳿 , . TextAnalyst [124] ; ; ; ; . WebAnalyst [124] Web-, : ; .
, , , , , . - IBS. web-, , , , , , , , ̲ . . ' , . ' (, , , ) (, , ) '. , , , , .
, Web. - Opinion Mining (OM) ( ) 㳿, , , . , 㳿 Opinion Mining.