Java Performance Extreme Programming

2018/05/19

Java性能极限编程(转载)

不同位数的Hash算法的碰撞概率

FNV_1a_64算法

FNV算法,其改进版本是fnv_1a,fnv是一种非常简洁并有效的hash算法,具体实现如下:

public static long fnv1a_64(String chars) {
    long hash = 0xcbf29ce484222325L;
    for (int i = 0; i < chars.length(); ++i) {
        char c = chars.charAt(i);
        hash ^= c;
        hash *= 0x100000001b3L;
    }

    return hash;
}

在上面的代码看来,可以做增量计算。增量计算可以用来很多场景优化,比如class的fullName由package和simpleName组成,如果已经计算了package的hashCode,就是在此基础上计算fullName的hashCode,不用从头开始计算,这在很多场景可以优化性能。

增量计算的实现

public static long fnv1a_64_increment(long basic, char seperator, String chars) {
    long hashCode = basic;
    
    // 补上分隔符的Hash增量
    hashCode ^= seperator;
    hashCode *= 0x100000001b3L;

    // 补上SimpleName的Hash增量
    for (int i = 0; i < chars.length(); ++i) {
        char ch = chars.charAt(i);
        hashCode ^= ch;
        hashCode *= 0x100000001b3L;
    }

    return hashCode;
}

// 先计算package的Hash
long pkgHashCode64 = fnv1a_64("com.alibaba.xxx");

// 增量计算fullName的HashCode
long fullNameHashCode64 = fnv1a_64_increment(pkgHashCode64, '.', "MyClassName");

大小写无关的FNV Hash算法实现

在SQL分析等场景,为了更好的用户体验,对象名查找需要大小写不敏感。

public static long fnv1a_64_lower(String key) {
    long hashCode = 0xcbf29ce484222325L;;
    for (int i = 0; i < key.length(); ++i) {
        char ch = key.charAt(i);

        // 所有大写都转成小写
        if (ch >= 'A' && ch <= 'Z') {
            ch = (char) (ch + 32);
        }

        hashCode ^= ch;
        hashCode *= 0x100000001b3L;
    }

    return hashCode;
}

二分查找和fnv_1a_hash_64组合使用

String[] strings = { "AVG", "COUNT", "MAX", "MIN", "STDDEV", "SUM" };

// 计算hashCodes并且排序
long[] hashCodes = new long[strings.length];
for (int i = 0; i < strings.length; i++) {
    hashCodes[i] = fnv1a_64_lower(strings[i]);
}
Arrays.sort(hashCodes);

// 建立根据hashCode排序的映射关系
String[] keywords = new String[hashCodes.length];
for (String str : strings) {
    long hash = FnvHash.fnv1a_64_lower(str);
    int index = Arrays.binarySearch(hashCodes, hash);
    keywords[index] = str;
}

// 根据hashCode二分查找到位置,然后获取
public String getKeyword(long hash) {
    int index = Arrays.binarySearch(hashCodes, hash);
    if (index < 0) {
        return null;
    }
    return keywords[index];
}

替换hashCode和equals实现

class SQLIdentifierExpr {
    private String    name;
    private long      hashCode64;
    
    public SQLIdentifierExpr(String name, long hash_lower){
        this.name = name;
        this.hashCode64 = hash_lower;
    }
    
    @Override
    public int hashCode() {
        long value = hashCode64;
        return (int)(value ^ (value >>> 32));
    }
    
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof SQLIdentifierExpr)) {
            return false;
        }
    
        SQLIdentifierExpr other = (SQLIdentifierExpr) obj;
        
        // hashCode64相等就假设相等
        return this.hashCode64 == other.hashCode64; 
    }
}

应用举例 使用HashCode64替换字符串比较

在Druid SQL Parser中,抽象与发分析时,使用了通过比较hashCode64替代字符串比较,极大提升语义分析的性能。

public class SQLExprTableSource extends SQLTableSourceImpl {

    protected SQLExpr     expr;
    public SQLTableSource findTableSource(long alias_hash) {
        if (alias_hash == 0) {
            return null;
        }

        if (aliasHashCode64() == alias_hash) {
            return this;
        }

        if (expr instanceof SQLName) {
            long exprNameHash = ((SQLName) expr).nameHashCode64();
            if (exprNameHash == alias_hash) {
                return this;
            }
        }

        if (expr instanceof SQLPropertyExpr) {
            long hash = ((SQLPropertyExpr) expr).hashCode64();
            if (hash == alias_hash) {
                return this;
            }
        }

        return null;
    }
} 

应用举例 fastjson android版本属性名快速匹配

// JSON字符串
{”id”:1001,”name”:”wenshao”}

// 类定义
public class Person {
   public int id;
   public String name;
}

// 读取属性名的fnv_1a_hashCode64
public long readNameHash() {
    long hash = 0x811c9dc5;
    for (; i < text.length; ++p) {
        char ch = text.charAt(p);
        if (ch == '"') break;
            hash ^= ch;
            hash *= 0x1000193;
        }
    return hash;
}

// 把fieldName hash读取出来,查找对应的FieldDeserializer
// 避免了构造fieldName的字符串,减少内存分配,也提升了fndFieldDeserializer的性能
long nameHash = readNameHash();
FieldDeserializer fieldDeser = fndFieldDeserializer(nameHash);
if (fieldDeser != null) {
   fieldDeser.readValue();
}
// 以上为伪码,真实实现会复杂很多

更多应用场景