Install CaboCha 0.68 on Windows and try dependency parsing with Python

Purpose

Install CaboCha 0.68 on Windows and perform dependency parsing from Python.

Prerequisites

MeCab must already be installed: https://code.google.com/p/mecab/downloads/list

This article assumes that mecab-0.996.exe was installed with UTF-8 selected as the character encoding.

Install CaboCha

  1. Download cabocha-0.68.exe from http://code.google.com/p/cabocha/downloads/list

  2. Run the downloaded EXE. When asked for a character encoding, select the same one that MeCab uses.

  3. Add "C:\Program Files (x86)\CaboCha\bin" to the PATH environment variable so that cabocha can be executed. Python also needs this to find the DLL.

  4. Confirm that it runs. Create a UTF-8 file called input.txt containing the text you want to analyze, and execute the following from the command prompt.

cabocha < input.txt > out.txt

If the analysis succeeds, out.txt will contain output like the following.


here---D
Marisa's-D
It's a slow place!
EOS
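
The same check can also be driven from Python without intermediate files. The following is a minimal sketch, not part of the original procedure: the helper name run_with_utf8_stdin and the ECHO_COMMAND stand-in are assumptions for illustration, and it presumes Python 3.5 or later for subprocess.run. It passes the text as UTF-8 bytes on standard input and decodes standard output as UTF-8, so the console code page is never involved.

```python
import subprocess
import sys

# Stand-in child process that copies stdin to stdout unchanged, so the helper
# can be tried without CaboCha installed (an assumption for illustration).
ECHO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]


def run_with_utf8_stdin(command, text):
    """Run `command`, feed `text` as UTF-8 bytes on stdin, return decoded stdout."""
    completed = subprocess.run(
        command,
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the command fails
    )
    return completed.stdout.decode("utf-8")
```

With the CaboCha bin folder on PATH, calling run_with_utf8_stdin(["cabocha"], text) should correspond to the redirect command above, assuming the UTF-8 installation described in this article.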

Note that input and output go through files here because the command prompt cannot handle UTF-8.

This is Marisa's slow place!
EOS

If the output instead looks like the above, with the whole sentence on one line and no dependency arrows, the character encoding of input.txt may not be UTF-8. (Note that Notepad saves files as ANSI by default.)
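
To rule out an encoding problem, the input file can be checked or regenerated from Python. This is a minimal sketch; the function names is_utf8_file and write_utf8_file are hypothetical helpers, not part of CaboCha. Keep in mind that a file containing only ASCII characters passes the check whatever encoding it was saved in, so the check is only meaningful for Japanese text.

```python
import io


def is_utf8_file(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    with io.open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def write_utf8_file(path, text):
    """Write `text` to `path` encoded as UTF-8, without a byte order mark."""
    with io.open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write(text)
```

For example, write_utf8_file("input.txt", sentence + "\n") creates an input file that does not depend on the editor's default encoding.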

In addition, the following error may occur.

svm.cpp(140) [version == MODEL_VERSION] incompatible version: 101
svm.cpp(751) [size >= 2] dep.cpp(79) [!failed] no such file or directory: C:\Program Files (x86)\CaboCha\etc\..\model\dep.ipa.model

This indicates that CaboCha was not upgraded cleanly from an earlier version, so delete the following folder.

C:\Users\<user name>\AppData\Local\VirtualStore\Program Files (x86)\CaboCha
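
The folder location can be derived from the LOCALAPPDATA environment variable, which Windows sets to C:\Users\<user name>\AppData\Local. The sketch below only builds the path and reports whether the folder exists; it deliberately deletes nothing. The function name virtualstore_cabocha_dir is a hypothetical helper introduced for illustration.

```python
import ntpath
import os


def virtualstore_cabocha_dir(local_appdata):
    """Build the VirtualStore path for CaboCha under a given %LOCALAPPDATA%."""
    # ntpath always joins with backslashes, matching Windows path rules.
    return ntpath.join(
        local_appdata, "VirtualStore", "Program Files (x86)", "CaboCha"
    )


if __name__ == "__main__":
    # LOCALAPPDATA is set by Windows; fall back to an empty string elsewhere.
    candidate = virtualstore_cabocha_dir(os.environ.get("LOCALAPPDATA", ""))
    print(candidate, "exists" if os.path.isdir(candidate) else "not found")
```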

Make CaboCha available from Python

  1. Download cabocha-0.68.tar.bz from http://code.google.com/p/cabocha/downloads/list  This file can be extracted with Lhaplus or a similar tool.

  2. Change to the python folder inside the extracted folder and run the following command.

python setup.py install

  3. The following error occurs.

Traceback (most recent call last):
  File "setup.py", line 13, in <module>
    version = cmd1("cabocha-config --version"),
  File "setup.py", line 7, in cmd1
    return os.popen(str).readlines()[0][:-1]
IndexError: list index out of range

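This happens because cabocha-config is not installed on Windows: the command prints nothing, readlines() returns an empty list, and indexing it with [0] raises the IndexError.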
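
The failure mode can be reproduced, and guarded against, outside of setup.py. The following is a minimal sketch assuming Python 3.5 or later; first_line is a hypothetical replacement for cmd1 that returns a default value instead of raising when the command produces no output.

```python
import subprocess


def first_line(command, default=None):
    """Return the first line of `command`'s stdout, or `default` if there is none."""
    completed = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the shell's "command not found" message
        universal_newlines=True,
    )
    lines = completed.stdout.splitlines()
    # An unknown command yields no stdout lines, which is what breaks cmd1.
    return lines[0] if lines else default
```

With this helper, first_line("cabocha-config --version", "0.68") falls back to the given version string on a machine where cabocha-config does not exist.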

  4. Modify setup.py.

Before the change


#!/usr/bin/env python

from distutils.core import setup,Extension,os
import string

def cmd1(str):
    return os.popen(str).readlines()[0][:-1]

def cmd2(str):
    return string.split (cmd1(str))

setup(name = "cabocha-python",
	version = cmd1("cabocha-config --version"),
	py_modules=["CaboCha"],
	ext_modules = [
		Extension("_CaboCha",
			["CaboCha_wrap.cxx",],
			include_dirs=cmd2("cabocha-config --inc-dir"),
			library_dirs=cmd2("cabocha-config --libs-only-L"),
			libraries=cmd2("cabocha-config --libs-only-l"))
			])


Replace version and the arguments of Extension in ext_modules with values that match the installed environment.

After the change


#!/usr/bin/env python

from distutils.core import setup,Extension,os
import string

def cmd1(str):
    return os.popen(str).readlines()[0][:-1]

def cmd2(str):
    return string.split (cmd1(str))

setup(name = "cabocha-python",
	version = "0.68",
	py_modules=["CaboCha"],
	ext_modules = [
		Extension("_CaboCha",
			["CaboCha_wrap.cxx",],
			include_dirs=[r"C:\Program Files (x86)\CaboCha\sdk"],
			library_dirs=[r"C:\Program Files (x86)\CaboCha\sdk"],
			libraries=['libcabocha'])
])
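
If CaboCha is installed somewhere other than the default folder, the three Extension arguments can be derived from a single install root instead of being written out by hand. This is only a sketch under that assumption: cabocha_extension_args is a hypothetical helper, and the sdk folder name and the libcabocha library name are taken from the modified setup.py above.

```python
import ntpath


def cabocha_extension_args(install_root=r"C:\Program Files (x86)\CaboCha"):
    """Return keyword arguments for Extension() based on the CaboCha install root."""
    sdk_dir = ntpath.join(install_root, "sdk")  # headers and the import library
    return {
        "include_dirs": [sdk_dir],
        "library_dirs": [sdk_dir],
        "libraries": ["libcabocha"],
    }
```

In setup.py this could be used as Extension("_CaboCha", ["CaboCha_wrap.cxx"], **cabocha_extension_args()).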

  5. Run setup.py again.

python setup.py install

  6. Enter the following sample program and try it.

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2 syntax (print statements).

import CaboCha

# Create a parser with default options.
c = CaboCha.Parser("")

# Sentence to parse: "Return the hat"
sentence = "帽子を返す"

# Parse once, then print the result in tree and lattice formats.
tree = c.parse(sentence)
print tree.toString(CaboCha.FORMAT_TREE)
print tree.toString(CaboCha.FORMAT_LATTICE)
#print tree.toString(CaboCha.FORMAT_XML)

# Dump every chunk together with the tokens it contains.
for i in range(tree.chunk_size()):
    chunk = tree.chunk(i)
    print 'Chunk:', i
    print ' Score:', chunk.score
    print ' Link:', chunk.link        # index of the chunk this one depends on (-1 for the root)
    print ' Size:', chunk.token_size  # number of tokens in the chunk
    print ' Pos:', chunk.token_pos    # index of the chunk's first token
    print ' Head:', chunk.head_pos    # head word
    print ' Func:', chunk.func_pos    # function word
    print ' Features:',
    for j in range(chunk.feature_list_size):
        print '  ' + chunk.feature_list(j)
    print
    print 'Text'
    for ix in range(chunk.token_pos, chunk.token_pos + chunk.token_size):
        print ' ', tree.token(ix).surface
    print

# Dump every token.
for i in range(tree.token_size()):
    token = tree.token(i)
    print 'Surface:', token.surface
    print ' Normalized:', token.normalized_surface
    print ' Feature:', token.feature
    print ' NE:', token.ne            # named entity
    print ' Info:', token.additional_info
    print ' Chunk:', token.chunk
    print

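Running it prints the following. The Japanese surface forms and feature strings are shown in English translation.

Hat-D
return
EOS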

* 0 1D 0/1 0.000000
Hat noun,General,*,*,*,*,hat,Bow,Boshi
Particles,Case particles,General,*,*,*,To,Wo,Wo
* 1 -1D 0/0 0.000000
Verb to return,Independence,*,*,Godan / Sa line,Uninflected word,return,Kaes,Kaes
EOS

Chunk: 0
 Score: 0.0
 Link: 1
 Size: 2
 Pos: 0
 Head: 0
 Func: 1
 Features:   FCASE:To
  FHS:hat
  FHP0:noun
  FHP1:General
  FFS:To
  FFP0:Particle
  FFP1:Case particles
  FFP2:General
  FLS:hat
  FLP0:noun
  FLP1:General
  FRS:To
  FRP0:Particle
  FRP1:Case particles
  FRP2:General
  LF:To
  RL:hat
  RH:hat
  RF:To
  FBOS:1
  GCASE:To
  A:To

Text
hat
To

Chunk: 1
 Score: 0.0
 Link: -1
 Size: 1
 Pos: 2
 Head: 0
 Func: 0
 Features:   FHS:return
  FHP0:verb
  FHP1:Independence
  FHF:Uninflected word
  FFS:return
  FFP0:verb
  FFP1:Independence
  FFF:Uninflected word
  FLS:return
  FLP0:verb
  FLP1:Independence
  FLF:Uninflected word
  FRS:return
  FRP0:verb
  FRP1:Independence
  FRF:Uninflected word
  LF:return
  RL:return
  RH:return
  RF:return
  FEOS:1
  A:Uninflected word

Text
return

Surface:hat
 Normalized:hat
 Feature:noun,General,*,*,*,*,hat,Bow,Boshi
 NE: None
 Info: None
 Chunk: <CaboCha.Chunk; proxy of <Swig Object of type 'CaboCha::Chunk *' at 0x0274A170> >

Surface:To
 Normalized:To
 Feature:Particle,Case particles,General,*,*,*,To,Wo,Wo
 NE: None
 Info: None
 Chunk: None

Surface:return
 Normalized:return
 Feature:verb,Independence,*,*,Godan / Sa line,Uninflected word,return,Kaes,Kaes
 NE: None
 Info: None
 Chunk: <CaboCha.Chunk; proxy of <Swig Object of type 'CaboCha::Chunk *' at 0x0274A170> >
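
The Link value of each chunk is what encodes the dependency structure: chunk 0 links to chunk 1, and chunk 1 has link -1 because it is the root. As a minimal sketch, the helpers below turn a parsed tree into (chunk text, head chunk text) pairs. The names chunk_text and dependency_pairs are hypothetical; they rely only on the accessors already used in the sample program (chunk_size, chunk, token, link, token_pos, token_size, surface).

```python
def chunk_text(tree, chunk):
    """Concatenate the surface forms of the tokens that belong to `chunk`."""
    start = chunk.token_pos
    return "".join(
        tree.token(ix).surface for ix in range(start, start + chunk.token_size)
    )


def dependency_pairs(tree):
    """Return (chunk text, head chunk text) pairs; the head is None for the root."""
    pairs = []
    for i in range(tree.chunk_size()):
        chunk = tree.chunk(i)
        # A negative link means the chunk depends on nothing (sentence root).
        head = tree.chunk(chunk.link) if chunk.link >= 0 else None
        head_text = chunk_text(tree, head) if head is not None else None
        pairs.append((chunk_text(tree, chunk), head_text))
    return pairs
```

With the bindings installed, dependency_pairs(c.parse(sentence)) would be the intended call, using the parser object c from the sample program.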

