Difference between revisions of «Índice de Czekanowski» (Czekanowski index)
Line 6: | Line 6: | ||
d = spp. absent from both groups | d = spp. absent from both groups | ||
+ | [http://people.revoledu.com/kardi/tutorial/Similarity/SimpleMatching.html Tutorial on measuring similarity] | ||
+ | [http://media.wiley.com/product_data/excerpt/61/04714696/0471469661.pdf Incomplete tutorial; the full reference still needs to be found. Simple explanation of similarity and distance indices] | ||
+ | <!-- | ||
+ | With binary variables, we traditionally focus on the notion of similarity rather | ||
+ | than distance (or dissimilarity). Consider two binary vectors x and y that consist | ||
+ | of two strings [xk], [yk] of binary data; compare them coordinatewise and do the | ||
+ | simple counting of occurrences: | ||
+ | a = number of occurrences when xk and yk are both equal to 1 | ||
+ | b = number of occurrences when xk = 0 and yk = 1 | ||
+ | c = number of occurrences when xk = 1 and yk = 0 | ||
+ | d = number of occurrences when xk and yk are both equal to 0 | ||
+ | |||
+ | These four numbers can be organized in a 2 by 2 co-occurrence matrix (contingency | ||
+ | table) that visualizes how “close” these two strings are to each other. | ||
+ | ::1 0 | ||
+ | ::1 a b | ||
+ | ::0 c d | ||
+ | Evidently, zero off-diagonal entries of this matrix (b = c = 0) indicate ideal matching | ||
+ | (the highest similarity). Based on these four entries, there are several commonly | ||
+ | encountered measures of similarity of binary vectors x and y. The simple matching coefficient is computed as the following ratio: | ||
+ | :<math>\frac{a + d}{a + b + c + d}</math> (1.4) | ||
+ | The Russell and Rao measure of similarity consists of the quotient | ||
+ | :<math>\frac{a}{a + b + c + d}</math> (1.5) | ||
+ | The Jaccard index considers only the positions where at least one input equals 1, ignoring joint absences (d): | ||
+ | :<math>\frac{a}{a + b + c}</math> (1.6) | ||
+ | The Czekanowski index is closely related to the Jaccard index, but by adding | ||
+ | the weight factor of 2, it emphasizes the coincidence of situations where entries | ||
+ | of x and y both assume values equal to 1: | ||
+ | :<math>\frac{2a}{2a + b + c}</math> (1.7) | ||
+ | --> | ||
[[Categoría:Glosario]] [[Categoría:Esbozo]] | [[Categoría:Glosario]] [[Categoría:Esbozo]] |
Revision as of 20:35, 1 Oct 2006
<math>Cz = \frac{2a}{2a + b + c}</math>
a = spp. common to both groups; b = spp. exclusive to group 1; c = spp. exclusive to group 2; d = spp. absent from both groups
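As an illustration of the definitions above, the following minimal Python sketch counts a, b, c and d from two presence/absence lists and computes Cz = 2a / (2a + b + c). The function names (`cooccurrence_counts`, `czekanowski`) and the sample data are hypothetical, chosen only for this example; the sketch assumes both lists cover the same species in the same order, and it treats the index as undefined when neither group contains any species.

```python
from typing import Sequence, Tuple


def cooccurrence_counts(group1: Sequence[int], group2: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two equal-length presence/absence (1/0) lists.

    a = species present in both groups
    b = species present only in group 1
    c = species present only in group 2
    d = species absent from both groups
    """
    if len(group1) != len(group2):
        raise ValueError("both lists must cover the same species")
    a = b = c = d = 0
    for x, y in zip(group1, group2):
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def czekanowski(a: int, b: int, c: int) -> float:
    """Czekanowski index Cz = 2a / (2a + b + c); d is not used."""
    denominator = 2 * a + b + c
    if denominator == 0:
        # Assumption of this sketch: no species in either group -> undefined.
        raise ValueError("index is undefined when both groups are empty")
    return 2 * a / denominator


# Hypothetical data: six species, 1 = present, 0 = absent.
site1 = [1, 1, 1, 0, 0, 1]
site2 = [1, 0, 1, 1, 0, 0]

a, b, c, d = cooccurrence_counts(site1, site2)
print(a, b, c, d)                      # 2 2 1 1
print(f"{czekanowski(a, b, c):.3f}")   # 0.571
```

Because d does not enter the formula, adding species that are absent from both groups leaves Cz unchanged; the index equals 1 when the two groups share every species they contain and 0 when they share none.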
Tutorial on measuring similarity
Incomplete tutorial; the full reference still needs to be found. Simple explanation of similarity and distance indices