faults on coding (to be continued)

零零散散的在编码过程中犯过一些错误,所以打算专门写一篇总结,持续记录这些错误,以便知错能改。

1 work一直良好的系统中,明显的错误不要改,例如:

private int age;
public void setAge(int age){
 age = age;
}

有些功能实际上错上加错的结果,当纠正一个错误的时候,结果就变成了错误。

2 理所当然的思维惯性。

下面的结果是什么?是不是true?

	Properties properties = new Properties();
        properties.put("key",true); 
        
	String valueStr = properties.getProperty("key");
	System.out.println(valueStr);
	System.out.println(Boolean.TRUE.toString().equalsIgnoreCase(valueStr));

源码分析:
实际上设置的是boolean型的值,然后getProperty返回的是null:

	    public String getProperty(String key) {
	        Object oval = super.get(key);
	        String sval = (oval instanceof String) ? (String)oval : null;
	        return ((sval == null) && (defaults != null)) ? defaults.getProperty(key) : sval;
	    }

3 针对接口编程和误用方法:

下面的写法试图cache一个值,当不存在对应的value时,自动加载一个值,那下面的值返回什么?

cache loader

  		CacheLoader<String, String> loader = new CacheLoader<String, String>() {

			@Override
			public String load(String key) throws Exception {
 				return key + "'s value";
			}
		};

使用:

		Cache<String, String> cache = CacheBuilder.newBuilder().maximumSize(10_000).expireAfterWrite(15, TimeUnit.MINUTES).build(loader);
  		String value = cache.getIfPresent("key1");

实际正确的写法

  		LoadingCache<String,String> cache = CacheBuilder.newBuilder().maximumSize(10_000).expireAfterWrite(15, TimeUnit.MINUTES).build(loader);
   		String value = cache.get("key1");

问题出在,一定要仔细阅读API的文档,不要想当然,同时针对接口编程,不定是针对顶层接口编程,如果上来就赋予给顶层接口,则后面的方法选择范围就比较小,可能就不假思考。

4 不假思索的认为某种类型是枚举

在判断响应是否是JSON body时,误以为MediaType.APPLICATION_JSON_TYPE肯定被定义成枚举,所以直接用==判断。

 MediaType.APPLICATION_JSON_TYPE == response.getMediaType();

修改:不定是每个感觉应该定义成枚举类型的就会定义成枚举类型,另外没有搞清楚状况前,用equals肯定比==更安全,

MediaType.APPLICATION_JSON_TYPE.equals(response.getMediaType()) 

5 Arrays.aslist返回的list?

下面的代码能编译,但是有没有问题?

		List<String> asList = Arrays.asList("123", "456");
		asList.removeIf(string -> string.equalsIgnoreCase("123"));

结果:

Exception in thread "main" java.lang.UnsupportedOperationException
	at java.util.AbstractList.remove(AbstractList.java:161)
	at java.util.AbstractList$Itr.remove(AbstractList.java:374)
	at java.util.Collection.removeIf(Collection.java:415)
	at com.github.metrics.Metric.main(Metric.java:29)

源码解析:

    @SuppressWarnings("varargs")
    public static <T> List<T> asList(T... a) {
        return new ArrayList<>(a); //此处返回的list是java.util.Arrays.ArrayList<E>,而并不是普通的java.util.ArrayList<E>
    }

而这种list的一些实现并未实现:

    public E remove(int index) {
        throw new UnsupportedOperationException();
    }

思考:可以调用的方法不见得可以work,返回看起来名字一样的,但是不见得是一个。

http protocol (1) – different types of timeout

项目中给http client 4.x设置了各种timeout,但是还是发现一些请求耗费的时间远超过预期,排查一部分请求是由于虚拟机或者GC等情况引起,还有一部分长耗时请求归咎于DNS解析的耗时,所以有必要梳理下http client使用中的各种timeout设置,从而对一个请求完成的最大时间有个正确而清晰的认识:

分析http client中,一次http请求的流程大致有2条主线如下:

(1)从连接池获取到连接->发送http请求->接受http响应

(2)从连接池获取连接,但是没有可用的,此时又可以创建新连接情况下-> DNS解析->建立连接-> TLS握手(假设是https)->发送http请求->接受http响应。

分解上面的2条主线,至少有4种timeout(socket timeout, connection timeout, request connection timeout, dns resolve timeout)影响了一次http请求的完成时间。

(1)从连接池获取连接的最长等待时间
通过复用连接池的连接,可以避免不断频繁开关连接,提高效率,所以从连接池获取连接时,当连接池满时,可以设置等待一定的时间来获取连接,而不是永久等待。

1.1 错误堆栈:

  org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
! at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:226) 
! at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:195) 

1.2 源码解析:

   @Override
    public ConnectionRequest requestConnection(
            final HttpRoute route,
            final Object state) {
        Args.notNull(route, "HTTP route");
        if (this.log.isDebugEnabled()) {
            this.log.debug("Connection request: " + format(route, state) + formatStats(route));
        }
        final Future<CPoolEntry> future = this.pool.lease(route, state, null);  //从连接池获取连接
        return new ConnectionRequest() {

            @Override
            public boolean cancel() {
                return future.cancel(true);
            }

            @Override
            public HttpClientConnection get(
                    final long timeout,
                    final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
                return leaseConnection(future, timeout, tunit);
            }

        };

    }
    protected HttpClientConnection leaseConnection(
            final Future<CPoolEntry> future,
            final long timeout,
            final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
        final CPoolEntry entry;
        try {
            entry = future.get(timeout, tunit);
            if (entry == null || future.isCancelled()) {
                throw new InterruptedException();
            }
            Asserts.check(entry.getConnection() != null, "Pool entry with no connection");
            if (this.log.isDebugEnabled()) {
                this.log.debug("Connection leased: " + format(entry) + formatStats(entry.getRoute()));
            }
            return CPoolProxy.newProxy(entry);
        } catch (final TimeoutException ex) {
            throw new ConnectionPoolTimeoutException("Timeout waiting for connection from pool");
        }
    }

1.3 设置方法:

	RequestConfig requestConfig = RequestConfig.custom().
 				setConnectionRequestTimeout(2 * 1000).

(2)DNS解析时间

拿到域名建立连接之后,需要进行dns解析,一般dns解析都会设置cache时间,例如5分钟,但是为了防止绑死在某个主机上,都不会永久cache,所以dns解析不可避免,然而dns解析的时间控制除了本身将这个过程异步化外并未提供任何可以设置的时间参数。

2.1 堆栈错误:

Caused by: java.net.UnknownHostException: nebulaik.webex.com.cn: Temporary failure in name resolution
        at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at com.webex.dsagent.client.http.DSADnsResolver.resolve(DSADnsResolver.java:24)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:111)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:84)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)

2.2 源码分析:

大多请求遇到这种错误,都耗时15秒,原因在于系统配置(/etc/resolv.conf ):

options rotate timeout:5 retries:3
nameserver 10.224.91.88
nameserver 10.224.91.99

诊断方法可以使用dig命令:

dig命令提供了2个参数:

+time=T
Sets the timeout for a query to T seconds. The default timeout is 5 seconds. An attempt to set T to less than 1 will result in a query timeout of 1 second being applied.
+tries=T
Sets the number of times to try UDP queries to server to T instead of the default, 3. If T is less than or equal to zero, the number of tries is silently rounded up to 1.

例子:

[root@centos~]# time dig www.baidu.com +time=1

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.4 <<>> www.baidu.com +time=1
;; global options: +cmd
;; connection timed out; no servers could be reached

real  0m3.010s
user 0m0.003s
sys  0m0.007s
[root@centos~]# time dig www.baidu.com +time=5

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.62.rc1.el6_9.4 <<>> www.baidu.com +time=5
;; global options: +cmd
;; connection timed out; no servers could be reached

real  0m15.013s
user 0m0.006s
sys  0m0.005s

另外一个追踪方法和问题实例:解析aaa.com.cn超时了。

dig +trace aaa.com.cn //示例

使用dig +trace定位到:

aaa.com.cn. 600 IN CNAME bbb.com.cn.  //别名
bbb.com.cn. 120 IN NS ns1.dns.com.cn.
bbb.com.cn. 120 IN NS ns2.dns.com.cn.

其中ns1.dns.com.cn和ns2.dns.com.cn负责解析这个域名,但是做了环境迁移,导致不可达。后来修改下这2条记录,就不再超时了。

3.3 设置方法:

要不修改配置文件(options rotate timeout:5 retries:3),要不将dns解析异步化。

附上各种DNS错误:
https://zhuanlan.zhihu.com/p/40659713

(3)连接建立时间

3.1 错误的堆栈:

Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to demosite.yy.zz:443 failed: connect timed out
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:150)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
        at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:84)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at org.jboss.resteasy.client.jaxrs.engines.ApacheHttpClient4Engine.invoke(ApacheHttpClient4Engine.java:312)
        ... 69 more
Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:337)
        at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
        ... 79 more

3.2 源码解析:

    void socketConnect(InetAddress address, int port, int timeout)
        throws IOException {
        int nativefd = checkAndReturnNativeFD();
 
        int connectResult;
        if (timeout <= 0) {  //不设置timeout的时间,永久阻塞
            connectResult = connect0(nativefd, address, port);
        } else {  //设置timeout时间,只等timeout时间。
            configureBlocking(nativefd, false);  //先设置为非阻塞模式
            try {
                connectResult = connect0(nativefd, address, port); //做个连接
                if (connectResult == WOULDBLOCK) { 
                    waitForConnect(nativefd, timeout); //最多等待阻塞timoout时间
                }
            } finally {
                configureBlocking(nativefd, true);  //设置回阻塞模式
            }
        }
 
    }
 

3.3 设置方法:

  		RequestConfig requestConfig = RequestConfig.custom().
 		setConnectTimeout(2 * 1000).

(4)数据处理时间:

发送完请求后,等待响应需要一定的时间,这个时间即为数据处理时间。除此之外,建立连接后的tls握手过程,也有多个请求响应过程,这个时间参数对这个交互过程也适用。

4.1 错误堆栈:

Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
        at java.net.SocketInputStream.read(SocketInputStream.java:171)
        at java.net.SocketInputStream.read(SocketInputStream.java:141)
        at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
        at sun.security.ssl.InputRecord.read(InputRecord.java:503)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
        at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
        at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
        at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
        at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
        at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
        at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
        at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
        at org.apache.http.impl.execchain.ServiceUnavailableRetryExec.execute(ServiceUnavailableRetryExec.java:84)
        at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)

4.2 源码解析:

http发完请求后,立马阻塞读取响应:

    public HttpResponse execute(
            final HttpRequest request,
            final HttpClientConnection conn,
            final HttpContext context) throws IOException, HttpException {
        try {
            HttpResponse response = doSendRequest(request, conn, context);
            if (response == null) {
                response = doReceiveResponse(request, conn, context); //阻塞读取响应
            }
            return response;
        }
    /**
     *  Enable/disable {@link SocketOptions#SO_TIMEOUT SO_TIMEOUT}
     *  with the specified timeout, in milliseconds. With this option set
     *  to a non-zero timeout, a read() call on the InputStream associated with
     *  this Socket will block for only this amount of time.  If the timeout
     *  expires, a <B>java.net.SocketTimeoutException</B> is raised, though the
     *  Socket is still valid. The option <B>must</B> be enabled
     *  prior to entering the blocking operation to have effect. The
     *  timeout must be {@code > 0}.
     *  A timeout of zero is interpreted as an infinite timeout.
     *
     * @param timeout the specified timeout, in milliseconds.
     * @exception SocketException if there is an error
     * in the underlying protocol, such as a TCP error.
     * @since   JDK 1.1
     * @see #getSoTimeout()
     */
    public synchronized void setSoTimeout(int timeout) throws SocketException {  //仅仅对读有效
        if (isClosed())
            throw new SocketException("Socket is closed");
        if (timeout < 0)
          throw new IllegalArgumentException("timeout can't be negative");

        getImpl().setOption(SocketOptions.SO_TIMEOUT, new Integer(timeout));
    }
    /**
     * Reads into an array of bytes at the specified offset using
     * the received socket primitive.
     * @param fd the FileDescriptor
     * @param b the buffer into which the data is read
     * @param off the start offset of the data
     * @param len the maximum number of bytes read
     * @param timeout the read timeout in ms
     * @return the actual number of bytes read, -1 is
     *          returned when the end of the stream is reached.
     * @exception IOException If an I/O error has occurred.
     */
    private native int socketRead0(FileDescriptor fd,
                                   byte b[], int off, int len,
                                   int timeout) //阻塞读取timeout时间

4.3 设置方法:

  	RequestConfig requestConfig = RequestConfig.custom().
 	setSocketTimeout(requestTimeoutInSecond * 1000).

假设不设置so_timeout的同时,设置了上文(3)中提到的connection timeout,则so_timeout = connection timeout

org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(int, Socket, HttpHost, InetSocketAddress, InetSocketAddress, HttpContext)
            if (connectTimeout > 0 && sock.getSoTimeout() == 0) {
                sock.setSoTimeout(connectTimeout);
            }

总结:
除了以上几种直接影响处理耗时的timeout参数为,尝试机制是另外一个影响处理耗时的机制,而且基本都是“翻倍”,这里不做赘述。所以从这个角度看,影响一次请求的耗时,主要包括以下因素:
(1) GC的STW效应
(2) 虚拟机的影响
(3) 应用层/系统层各种timeout参数
(4) 应用层/系统层重试次数

通过上文可知,除非将一个请求过程完全异步化,否则必须将所有的参数都了然于心才能很有信心的了解一个请求到底最长需要多久来完成,但是异步化的缺点就是管理的复杂性、缺乏负反馈等问题,需要综合权衡。