Dear,

If you need a CCM program, there is one at http://groups.csail.mit.edu/rbg/code/multiling_induction/. However, the code is written in OCaml, and I found it quite difficult to compile (I even had to fix some lines of code and 'infer' the structure of the configuration file and the makefiles). I hope the steps below save you a lot of time.

1. Install godi (http://godi.camlcity.org/godi/index.html).

2. Install GSL, the GNU Scientific Library.

3. Run godi_console, then install the following packages:

   [144] godi-ocaml          3.11.2    3.11.2    The core of the OCaml system
   [ 40] base-pcre           7.7#1     7.7#1     The version of PCRE for GODI
   [ 62] conf-pcre           6         6         Configures which pcre library
   [ 81] godi-batteries      1.3.0     1.3.0     a community-maintained founda
   [ 95] godi-camomile       0.7.1#7   0.7.1#7   Camomile is a comprehensive
   [111] godi-extlib         1.5.1     1.5.1     User-supported Extended Stand
   [114] godi-findlib        1.2.7     1.2.7     The findlib/ocamlfind package
   [166] godi-ocamlgsl       0.6.2#1   0.6.2#1   GSL bindings for OCaml
   [167] godi-ocamlmakefile  6.29.3#1  6.29.3#1  Generic Makefile to build OCa
   [186] godi-pcre           6.1.0#1   6.1.0#1   Perl compatible regular expre
   [200] godi-tools          2.0.15    2.0.15    godi_console and other tools

4. In the attached zip file you can find the executables ccm and ccm_gibbs, which were compiled on Ubuntu 10.10. You can also compile the source code yourself by running the makefiles in this order:
   - 1. compile myDynArray
   - 2. compile hashSet
   - 3. compile util
   - 4. compile ccm, ccm_gibbs
   Please change the paths in the makefiles to your own.

5. Prepare the data. I wrote a small tool for this step. In the folder 'corpus' there are two WSJ corpora: treebank and right-branching. The latter can be used for initializing CCM if you want. To complete this step, just call runme.sh. You will then find four files in the folder 'data':
   - poses and brackets : POS and bracket sequences from the right-branching corpus
   - test_poses and test_brackets : POS and bracket sequences from the WSJ treebank corpus

6. Run ccm / ccm_gibbs. First take a look at the file 'config'.
The structure of the 'config' file is:

   senlen [int]           // maximum sentence length
   testlen [int]          // maximum test sentence length
   litmit [int]           // maximum number of sentences
   dir [string]           // the dir path of the data
   poses [string]         // POS sequence file - for initializing
   brackets [string]      // bracket sequence file - for initializing
   test_poses [string]    // POS sequence file - for testing
   test_brackets [string] // bracket sequence file - for testing
   restarts [int]         // number of times to run the test (you can run the test many times to collect statistics)
   addc [int]             // I think it is for smoothing (please read the second paragraph, page 1413,
                          // Klein and Manning)
   addd [int]             //

To run it, just call './ccm config'.

The F-score should be about 67%. That is lower than the result reported in Klein's paper (71%). I haven't found out the reason yet.

Best,
Phong
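
For anyone who wants to sanity-check the F-score figure independently of the ccm binary, here is a minimal sketch of an unlabeled bracket F-score in Python. It is not taken from the CCM code. It assumes that each sentence's brackets have already been parsed into (start, end) word-offset pairs with an exclusive end, that trivial brackets (single words and the whole sentence) are ignored, and that counts are pooled over all sentences before precision and recall are computed. The function names and the data layout are hypothetical, and the ccm program may evaluate differently.

```python
from typing import Iterable, List, Set, Tuple

# Assumed representation: a bracket is (start, end) in word offsets, end exclusive.
Span = Tuple[int, int]


def nontrivial(spans: Iterable[Span], n_words: int) -> Set[Span]:
    """Drop single-word brackets and the bracket covering the whole sentence."""
    return {(i, j) for (i, j) in spans if j - i > 1 and (i, j) != (0, n_words)}


def bracket_f1(sentences: List[Tuple[int, Set[Span], Set[Span]]]) -> float:
    """Unlabeled bracket F1 over (n_words, gold_spans, predicted_spans) triples.

    Counts are pooled over all sentences (an assumption, not necessarily
    what the ccm binary does).
    """
    matched = n_gold = n_pred = 0
    for n_words, gold, pred in sentences:
        g = nontrivial(gold, n_words)
        p = nontrivial(pred, n_words)
        matched += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if matched == 0:
        return 0.0
    precision = matched / n_pred
    recall = matched / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example: one 5-word sentence, gold tree vs. a right-branching guess.
gold = {(0, 5), (0, 2), (2, 5), (3, 5)}
right_branching = {(0, 5), (1, 5), (2, 5), (3, 5)}
score = bracket_f1([(5, gold, right_branching)])
print(round(score, 3))  # 0.667
```

In the toy example, two of the three non-trivial predicted brackets match the gold ones, so precision and recall are both 2/3. If the program applies different conventions (for example, keeping whole-sentence brackets or averaging per sentence), the numbers will shift, which is one thing to check when comparing against a published figure.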