import termextract.english_plaintext
import termextract.core
from pprint import pprint  # used to pretty-print the results in this sample
# Read the sample text; the with-block closes the file automatically
with open("eng_sample_s.txt", "r", encoding="utf-8") as f:
    text = f.read()
print(text)
Artificial intelligence (AI) is the intelligence exhibited by machines. In computer science, an ideal "intelligent" machine is a flexible rational agent that perceives its environment and takes actions that maximize its chance of success at an arbitrary goal.[1] Colloquially, the term "artificial intelligence" is likely to be applied when a machine uses cutting-edge techniques to competently perform or mimic "cognitive" functions that we intuitively associate with human minds, such as "learning" and "problem solving".[2] The colloquial connotation, especially among the public, associates artificial intelligence with machines that are "cutting-edge" (or even "mysterious"). This subjective borderline around what constitutes "artificial intelligence" tends to shrink over time; for example, optical character recognition is no longer perceived as an exemplar of "artificial intelligence" as it is nowadays a mundane routine technology.[3] Modern examples of AI include computers that can beat professional players at Chess and Go, and self-driving cars that navigate crowded city streets. AI research is highly technical and specialized, and is deeply divided into subfields that often fail to communicate with each other.[4] Some of the division is due to social and cultural factors: subfields have grown up around particular institutions and the work of individual researchers. AI research is also divided by several technical issues. Some subfields focus on the solution of specific problems. Others focus on one of several possible approaches or on the use of a particular tool or towards the accomplishment of particular applications. 
The central problems (or goals) of AI research include reasoning, knowledge, planning, learning, natural language processing (communication), perception and the ability to move and manipulate objects.[5] General intelligence is still among the field's long-term goals.[6] Currently popular approaches include statistical methods, computational intelligence and traditional symbolic AI. There are a large number of tools used in AI, including versions of search and mathematical optimization, logic, methods based on probability and economics, and many others. The AI field is interdisciplinary, in which a number of sciences and professions converge, including computer science, mathematics, psychology, linguistics, philosophy and neuroscience, as well as other specialized fields such as artificial psychology. The field was founded on the claim that a central property of humans, human intelligence—the sapience of Homo sapiens sapiens—"can be so precisely described that a machine can be made to simulate it."[7] This raises philosophical arguments about the nature of the mind and the ethics of creating artificial beings endowed with human-like intelligence, issues which have been explored by myth, fiction and philosophy since antiquity.[8] Artificial intelligence has been the subject of tremendous optimism[9] but has also suffered stunning setbacks.[10] Today AI techniques have become an essential part of the technology industry, providing the heavy lifting for many of the most challenging problems in computer science.[11]
frequency = termextract.english_plaintext.cmp_noun_dict(text)
pprint(frequency)
#term_list = termextract.english_plaintext.cmp_noun_list(text)
#pprint(term_list)
{'AI': 1, 'AI field': 1, 'AI research': 2, 'AI techniques': 1, 'Artificial intelligence': 1, 'Artificial intelligence AI': 1, 'Chess': 1, 'Colloquially': 1, 'Currently popular approaches include statistical methods': 1, 'Modern examples of AI include computers': 1, 'ability': 1, 'accomplishment of particular applications': 1, 'among': 1, 'applied': 1, 'approaches': 1, 'arbitrary': 1, 'artificial psychology': 1, 'associates artificial intelligence': 1, 'beat professional players': 1, 'cars': 1, 'central': 1, 'central property of humans': 1, 'challenging': 1, 'chance of success': 1, 'claim': 1, 'colloquial connotation': 1, 'communicate': 1, 'competently perform': 1, 'computational intelligence': 1, 'computer': 1, 'computer science': 1, 'constitutes artificial intelligence tends': 1, 'cultural factors': 1, 'deeply divided': 1, 'divided': 1, 'division': 1, 'due': 1, 'economics': 1, 'environment': 1, 'especially among': 1, 'essential': 1, 'ethics of creating artificial beings endowed': 1, 'even mysterious': 1, 'example': 1, 'exemplar of artificial intelligence': 1, 'explored': 1, 'fail': 1, 'fiction': 1, 'field': 1, "field's": 1, 'flexible rational agent': 1, 'focus': 1, 'founded': 1, 'goals of AI research include reasoning': 1, 'grown': 1, 'heavy lifting': 1, 'highly technical': 1, 'human intelligence—the sapience of Homo sapiens sapiens—can': 1, 'human minds': 1, 'ideal intelligent machine': 1, 'including computer science': 1, 'including versions of search': 1, 'individual researchers': 1, 'intelligence': 2, 'intelligence exhibited': 1, 'interdisciplinary': 1, 'intuitively associate': 1, 'issues': 1, 'knowledge': 1, 'learning': 2, 'linguistics': 1, 'logic': 1, 'machine': 2, 'machines': 2, 'manipulate': 1, 'mathematical optimization': 1, 'mathematics': 1, 'maximize': 1, 'methods based': 1, 'mimic cognitive functions': 1, 'mind': 1, 'move': 1, 'mundane routine': 1, 'myth': 1, 'natural language processing communication': 1, 'nature': 1, 'navigate crowded city streets': 
1, 'neuroscience': 1, 'nowadays': 1, 'optical character recognition': 1, 'particular institutions': 1, 'particular tool': 1, 'perceived': 1, 'perceives': 1, 'perception': 1, 'philosophy': 2, 'planning': 1, 'precisely described': 1, 'probability': 1, 'professions converge': 1, 'providing': 1, 'psychology': 1, 'public': 1, 'raises philosophical arguments': 1, 'sciences': 1, 'shrink': 1, 'simulate': 1, 'social': 1, 'solution of specific': 1, 'specialized': 1, 'specialized fields': 1, 'subfields': 2, 'subfields focus': 1, 'subject of tremendous optimism9': 1, 'subjective borderline': 1, 'suffered stunning': 1, 'takes actions': 1, 'technical issues': 1, 'techniques': 1, 'technology industry': 1, 'term artificial intelligence': 1, 'time': 1, 'tools': 1, 'towards': 1, 'traditional symbolic AI': 1}
lr = termextract.core.score_lr(
    frequency,
    ignore_words=termextract.english_plaintext.IGNORE_WORDS,
    lr_mode=1,
    average_rate=1,
)
pprint(lr)
{'AI': 5.916079783099616, 'AI field': 2.892507608519078, 'AI research': 4.090623489235047, 'AI techniques': 2.892507608519078, 'Artificial intelligence': 3.1301691601465746, 'Artificial intelligence AI': 3.8701091424450826, 'Chess': 1.0, 'Colloquially': 1.0, 'Currently popular approaches include statistical methods': 2.1189261887185906, 'Modern examples of AI include computers': 2.13480888082866, 'ability': 1.0, 'accomplishment of particular applications': 1.5422108254079407, 'among': 1.4142135623730951, 'applied': 1.0, 'approaches': 2.0, 'arbitrary': 1.0, 'artificial psychology': 3.027400104035091, 'associates artificial intelligence': 3.7288210710016374, 'beat professional players': 1.5874010519681994, 'cars': 1.0, 'central': 1.4142135623730951, 'central property of humans': 1.4142135623730951, 'challenging': 1.0, 'chance of success': 1.2599210498948732, 'claim': 1.0, 'colloquial connotation': 1.4142135623730951, 'communicate': 1.0, 'competently perform': 1.4142135623730951, 'computational intelligence': 2.8284271247461903, 'computer': 2.449489742783178, 'computer science': 2.0597671439071177, 'constitutes artificial intelligence tends': 2.9262229190053666, 'cultural factors': 1.4142135623730951, 'deeply divided': 1.4142135623730951, 'divided': 1.4142135623730951, 'division': 1.0, 'due': 1.0, 'economics': 1.0, 'environment': 1.0, 'especially among': 1.4142135623730951, 'essential': 1.0, 'ethics of creating artificial beings endowed': 1.9310155543137495, 'even mysterious': 1.4142135623730951, 'example': 1.0, 'exemplar of artificial intelligence': 2.6833582480460967, 'explored': 1.0, 'fail': 1.0, 'fiction': 1.0, 'field': 1.4142135623730951, "field's": 1.0, 'flexible rational agent': 1.5874010519681994, 'focus': 1.4142135623730951, 'founded': 1.0, 'goals of AI research include reasoning': 2.261751222748436, 'grown': 1.0, 'heavy lifting': 1.4142135623730951, 'highly technical': 1.681792830507429, 'human intelligence—the sapience of Homo sapiens sapiens—can': 
1.6888822547634152, 'human minds': 1.5650845800732873, 'ideal intelligent machine': 1.5874010519681994, 'including computer science': 1.9441612972396656, 'including versions of search': 1.4877378261644902, 'individual researchers': 1.4142135623730951, 'intelligence': 5.656854249492381, 'intelligence exhibited': 2.8284271247461903, 'interdisciplinary': 1.0, 'intuitively associate': 1.4142135623730951, 'issues': 1.4142135623730951, 'knowledge': 1.0, 'learning': 1.0, 'linguistics': 1.0, 'logic': 1.0, 'machine': 1.4142135623730951, 'machines': 1.0, 'manipulate': 1.0, 'mathematical optimization': 1.4142135623730951, 'mathematics': 1.0, 'maximize': 1.0, 'methods based': 1.681792830507429, 'mimic cognitive functions': 1.5874010519681994, 'mind': 1.0, 'move': 1.0, 'mundane routine': 1.4142135623730951, 'myth': 1.0, 'natural language processing communication': 1.681792830507429, 'nature': 1.0, 'navigate crowded city streets': 1.681792830507429, 'neuroscience': 1.0, 'nowadays': 1.0, 'optical character recognition': 1.5874010519681994, 'particular institutions': 2.0, 'particular tool': 2.0, 'perceived': 1.0, 'perceives': 1.0, 'perception': 1.0, 'philosophy': 1.0, 'planning': 1.0, 'precisely described': 1.4142135623730951, 'probability': 1.0, 'professions converge': 1.4142135623730951, 'providing': 1.0, 'psychology': 1.4142135623730951, 'public': 1.0, 'raises philosophical arguments': 1.5874010519681994, 'sciences': 1.0, 'shrink': 1.0, 'simulate': 1.0, 'social': 1.0, 'solution of specific': 1.2599210498948732, 'specialized': 1.4142135623730951, 'specialized fields': 1.4142135623730951, 'subfields': 1.4142135623730951, 'subfields focus': 1.4142135623730951, 'subject of tremendous optimism9': 1.4142135623730951, 'subjective borderline': 1.4142135623730951, 'suffered stunning': 1.4142135623730951, 'takes actions': 1.4142135623730951, 'technical issues': 1.681792830507429, 'techniques': 1.4142135623730951, 'technology industry': 1.4142135623730951, 'term artificial intelligence': 
3.7288210710016374, 'time': 1.0, 'tools': 1.0, 'towards': 1.0, 'traditional symbolic AI': 2.557759296802023}
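The LR scores above reward words that combine with many other words inside compound terms. The following is a minimal sketch of that idea only, not the code of `termextract.core.score_lr`: the function name `simple_lr` and the toy dictionary are hypothetical, and the sketch ignores `ignore_words`, `lr_mode` and `average_rate`, so it will not reproduce the numbers printed above.

```python
from collections import defaultdict


def simple_lr(frequency):
    """Toy LR score (hypothetical helper, not the termextract implementation).

    For every word, count how often another word appears directly to its
    left or right inside a term, weighted by term frequency. A term's score
    is the geometric mean of (left + 1) * (right + 1) over its words.
    """
    left = defaultdict(int)   # connections on the left side of a word
    right = defaultdict(int)  # connections on the right side of a word
    for term, freq in frequency.items():
        words = term.split()
        for prev_word, next_word in zip(words, words[1:]):
            right[prev_word] += freq
            left[next_word] += freq

    scores = {}
    for term in frequency:
        words = term.split()
        product = 1.0
        for word in words:
            product *= (left[word] + 1) * (right[word] + 1)
        # 2 * len(words) factors were multiplied, so take that root
        scores[term] = product ** (1 / (2 * len(words)))
    return scores


# Toy input in the same shape as the frequency dictionary
toy_frequency = {"AI": 1, "AI research": 2, "AI field": 1}
toy_scores = simple_lr(toy_frequency)
for term, score in toy_scores.items():
    print(term, score, sep="\t")
```

In this toy input, "AI" is followed by another word three times, so every term containing it scores above 1, while a word that never combines with others would score exactly 1.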
term_imp = termextract.core.term_importance(frequency, lr)
pprint(term_imp)
{'AI': 5.916079783099616, 'AI field': 2.892507608519078, 'AI research': 8.181246978470094, 'AI techniques': 2.892507608519078, 'Artificial intelligence': 3.1301691601465746, 'Artificial intelligence AI': 3.8701091424450826, 'Chess': 1.0, 'Colloquially': 1.0, 'Currently popular approaches include statistical methods': 2.1189261887185906, 'Modern examples of AI include computers': 2.13480888082866, 'ability': 1.0, 'accomplishment of particular applications': 1.5422108254079407, 'among': 1.4142135623730951, 'applied': 1.0, 'approaches': 2.0, 'arbitrary': 1.0, 'artificial psychology': 3.027400104035091, 'associates artificial intelligence': 3.7288210710016374, 'beat professional players': 1.5874010519681994, 'cars': 1.0, 'central': 1.4142135623730951, 'central property of humans': 1.4142135623730951, 'challenging': 1.0, 'chance of success': 1.2599210498948732, 'claim': 1.0, 'colloquial connotation': 1.4142135623730951, 'communicate': 1.0, 'competently perform': 1.4142135623730951, 'computational intelligence': 2.8284271247461903, 'computer': 2.449489742783178, 'computer science': 2.0597671439071177, 'constitutes artificial intelligence tends': 2.9262229190053666, 'cultural factors': 1.4142135623730951, 'deeply divided': 1.4142135623730951, 'divided': 1.4142135623730951, 'division': 1.0, 'due': 1.0, 'economics': 1.0, 'environment': 1.0, 'especially among': 1.4142135623730951, 'essential': 1.0, 'ethics of creating artificial beings endowed': 1.9310155543137495, 'even mysterious': 1.4142135623730951, 'example': 1.0, 'exemplar of artificial intelligence': 2.6833582480460967, 'explored': 1.0, 'fail': 1.0, 'fiction': 1.0, 'field': 1.4142135623730951, "field's": 1.0, 'flexible rational agent': 1.5874010519681994, 'focus': 1.4142135623730951, 'founded': 1.0, 'goals of AI research include reasoning': 2.261751222748436, 'grown': 1.0, 'heavy lifting': 1.4142135623730951, 'highly technical': 1.681792830507429, 'human intelligence—the sapience of Homo sapiens sapiens—can': 
1.6888822547634152, 'human minds': 1.5650845800732873, 'ideal intelligent machine': 1.5874010519681994, 'including computer science': 1.9441612972396656, 'including versions of search': 1.4877378261644902, 'individual researchers': 1.4142135623730951, 'intelligence': 11.313708498984761, 'intelligence exhibited': 2.8284271247461903, 'interdisciplinary': 1.0, 'intuitively associate': 1.4142135623730951, 'issues': 1.4142135623730951, 'knowledge': 1.0, 'learning': 2.0, 'linguistics': 1.0, 'logic': 1.0, 'machine': 2.8284271247461903, 'machines': 2.0, 'manipulate': 1.0, 'mathematical optimization': 1.4142135623730951, 'mathematics': 1.0, 'maximize': 1.0, 'methods based': 1.681792830507429, 'mimic cognitive functions': 1.5874010519681994, 'mind': 1.0, 'move': 1.0, 'mundane routine': 1.4142135623730951, 'myth': 1.0, 'natural language processing communication': 1.681792830507429, 'nature': 1.0, 'navigate crowded city streets': 1.681792830507429, 'neuroscience': 1.0, 'nowadays': 1.0, 'optical character recognition': 1.5874010519681994, 'particular institutions': 2.0, 'particular tool': 2.0, 'perceived': 1.0, 'perceives': 1.0, 'perception': 1.0, 'philosophy': 2.0, 'planning': 1.0, 'precisely described': 1.4142135623730951, 'probability': 1.0, 'professions converge': 1.4142135623730951, 'providing': 1.0, 'psychology': 1.4142135623730951, 'public': 1.0, 'raises philosophical arguments': 1.5874010519681994, 'sciences': 1.0, 'shrink': 1.0, 'simulate': 1.0, 'social': 1.0, 'solution of specific': 1.2599210498948732, 'specialized': 1.4142135623730951, 'specialized fields': 1.4142135623730951, 'subfields': 2.8284271247461903, 'subfields focus': 1.4142135623730951, 'subject of tremendous optimism9': 1.4142135623730951, 'subjective borderline': 1.4142135623730951, 'suffered stunning': 1.4142135623730951, 'takes actions': 1.4142135623730951, 'technical issues': 1.681792830507429, 'techniques': 1.4142135623730951, 'technology industry': 1.4142135623730951, 'term artificial intelligence': 
3.7288210710016374, 'time': 1.0, 'tools': 1.0, 'towards': 1.0, 'traditional symbolic AI': 2.557759296802023}
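In the output above, each importance value equals the term's frequency multiplied by its LR score (for example, 'AI research' has frequency 2 and LR 4.09, giving 8.18). The short sketch below checks that relationship on three values copied from the results; the variable names are illustrative, and this is an observation about this output rather than a description of how `term_importance` is implemented.

```python
# Values copied from the frequency and LR outputs shown earlier
frequency_sample = {"AI research": 2, "intelligence": 2, "AI": 1}
lr_sample = {
    "AI research": 4.090623489235047,
    "intelligence": 5.656854249492381,
    "AI": 5.916079783099616,
}

# Importance as frequency x LR (assumed relationship, matches the output above)
importance_sample = {
    term: frequency_sample[term] * lr_sample[term] for term in frequency_sample
}

for term, value in importance_sample.items():
    print(term, value, sep="\t")
```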
import collections
data_collection = collections.Counter(term_imp)
# Print the terms in descending order of importance, tab-separated
for cmp_noun, value in data_collection.most_common():
    print(cmp_noun, value, sep="\t")
intelligence	11.313708498984761
AI research	8.181246978470094
AI	5.916079783099616
Artificial intelligence AI	3.8701091424450826
term artificial intelligence	3.7288210710016374
associates artificial intelligence	3.7288210710016374
Artificial intelligence	3.1301691601465746
artificial psychology	3.027400104035091
constitutes artificial intelligence tends	2.9262229190053666
AI techniques	2.892507608519078
AI field	2.892507608519078
subfields	2.8284271247461903
computational intelligence	2.8284271247461903
intelligence exhibited	2.8284271247461903
machine	2.8284271247461903
exemplar of artificial intelligence	2.6833582480460967
traditional symbolic AI	2.557759296802023
computer	2.449489742783178
goals of AI research include reasoning	2.261751222748436
Modern examples of AI include computers	2.13480888082866
Currently popular approaches include statistical methods	2.1189261887185906
computer science	2.0597671439071177
philosophy	2.0
particular institutions	2.0
learning	2.0
particular tool	2.0
machines	2.0
approaches	2.0
including computer science	1.9441612972396656
ethics of creating artificial beings endowed	1.9310155543137495
human intelligence—the sapience of Homo sapiens sapiens—can	1.6888822547634152
technical issues	1.681792830507429
methods based	1.681792830507429
navigate crowded city streets	1.681792830507429
highly technical	1.681792830507429
natural language processing communication	1.681792830507429
mimic cognitive functions	1.5874010519681994
flexible rational agent	1.5874010519681994
raises philosophical arguments	1.5874010519681994
ideal intelligent machine	1.5874010519681994
optical character recognition	1.5874010519681994
beat professional players	1.5874010519681994
human minds	1.5650845800732873
accomplishment of particular applications	1.5422108254079407
including versions of search	1.4877378261644902
psychology	1.4142135623730951
subfields focus	1.4142135623730951
takes actions	1.4142135623730951
suffered stunning	1.4142135623730951
subjective borderline	1.4142135623730951
deeply divided	1.4142135623730951
cultural factors	1.4142135623730951
field	1.4142135623730951
heavy lifting	1.4142135623730951
professions converge	1.4142135623730951
colloquial connotation	1.4142135623730951
mathematical optimization	1.4142135623730951
technology industry	1.4142135623730951
individual researchers	1.4142135623730951
focus	1.4142135623730951
precisely described	1.4142135623730951
intuitively associate	1.4142135623730951
among	1.4142135623730951
central	1.4142135623730951
issues	1.4142135623730951
specialized fields	1.4142135623730951
competently perform	1.4142135623730951
techniques	1.4142135623730951
subject of tremendous optimism9	1.4142135623730951
divided	1.4142135623730951
especially among	1.4142135623730951
specialized	1.4142135623730951
even mysterious	1.4142135623730951
central property of humans	1.4142135623730951
mundane routine	1.4142135623730951
solution of specific	1.2599210498948732
chance of success	1.2599210498948732
Chess	1.0
providing	1.0
field's	1.0
claim	1.0
sciences	1.0
nature	1.0
arbitrary	1.0
fiction	1.0
perceives	1.0
towards	1.0
challenging	1.0
example	1.0
applied	1.0
interdisciplinary	1.0
perception	1.0
communicate	1.0
explored	1.0
neuroscience	1.0
essential	1.0
environment	1.0
planning	1.0
founded	1.0
manipulate	1.0
logic	1.0
due	1.0
ability	1.0
mind	1.0
time	1.0
tools	1.0
division	1.0
myth	1.0
shrink	1.0
economics	1.0
knowledge	1.0
Colloquially	1.0
social	1.0
simulate	1.0
perceived	1.0
grown	1.0
maximize	1.0
fail	1.0
linguistics	1.0
nowadays	1.0
public	1.0
cars	1.0
mathematics	1.0
move	1.0
probability	1.0
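Most of the ranked list consists of low-scoring single words, so in practice only the head of the ranking may be of interest. As a small sketch, `Counter.most_common` accepts an optional count that limits the result to the top entries; the sample dictionary below holds four values copied from the results above, and the cutoff of 2 is an arbitrary choice for illustration.

```python
from collections import Counter

# Four importance values copied from the results above
term_imp_sample = {
    "AI": 5.916079783099616,
    "intelligence": 11.313708498984761,
    "Chess": 1.0,
    "AI research": 8.181246978470094,
}

# Keep only the two highest-scoring terms
top_terms = Counter(term_imp_sample).most_common(2)
for term, score in top_terms:
    print(term, score, sep="\t")
```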