Saturday, June 7, 2008

A discussion sparked by an IBM developerWorks article on Java memory leaks: what causes them and how to resolve them

While studying Java memory leaks, I came across this article: http://www.ibm.com/developerworks/cn/java/l-JavaMemoryLeak/index.html

I think its example is flawed: at the very least, if v (the Vector object) is a local variable, it will not cause a memory leak.

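Below is a simple example of a memory leak. In this example, we allocate Object instances in a loop and put each one into a Vector. If we release only the reference itself, the Vector still references the object, so the object cannot be reclaimed by the GC. Therefore, once an object has been added to the Vector, it must also be removed from the Vector; the simplest way is to set the Vector object to null.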

Vector v = new Vector(10);
for (int i = 1; i < 100; i++) {
    Object o = new Object();
    v.add(o);
    o = null;
}

// At this point none of the Object instances has been released, because the variable v still references them.


I also found this article on www.matrix.org.cn, where readers both supported and criticized it.


Here are some of the more valuable points:



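1. This is not a "memory leak" in the computer-science sense at all, but one that exists only in the programmer's mind. The example merely shows a programmer deliberately keeping those objects in the Vector, so of course the GC will not reclaim them.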


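2. This is not a memory leak problem. Java has no global variables, so the Vector above must live inside a class, and once objects of that class are no longer referenced, the Vector is released too.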


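3. I do not think the article's example leaks memory. As long as the Vector is no longer referenced, all of its resources will be released, won't they? It is only a question of when the memory is released. Has the author defined memory that is not released promptly as a memory leak? That memory will all be released eventually. Still, the author's point is not without merit: it is worth considering if you want memory released promptly.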


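4. In reply to "As long as the Vector is no longer used, the GC will reclaim it anyway; it is only a matter of time": if this is a website with many users online at once, that unreleased memory will still bring your server down. Holding memory you could have released is as wasteful as paying 10 yuan for something others buy for 1 yuan. In short, when writing a program, release anything that can be released and will not be needed in the near term, and do it right away.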


However, when the object v is a field of a class, and especially a static field, it can cause serious trouble.
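The difference between the two cases can be sketched as follows. This is a minimal illustration rather than code from the article: the class name RetentionDemo and the field RETAINED are hypothetical, and an ArrayList stands in for the Vector.

```java
import java.util.ArrayList;
import java.util.List;

public class RetentionDemo {
    // Hypothetical static field: reachable for as long as the class stays loaded.
    private static final List<Object> RETAINED = new ArrayList<>();

    // The list is a local variable, so it and its 99 objects become
    // unreachable (eligible for collection) once the method returns.
    public static int fillLocal() {
        List<Object> v = new ArrayList<>();
        for (int i = 1; i < 100; i++) {
            v.add(new Object());
        }
        return v.size();
    }

    // Every call adds 99 more objects that stay reachable through RETAINED.
    public static int fillStatic() {
        for (int i = 1; i < 100; i++) {
            RETAINED.add(new Object());
        }
        return RETAINED.size();
    }

    public static void main(String[] args) {
        System.out.println(fillLocal());  // 99
        System.out.println(fillLocal());  // 99 again: nothing accumulates
        System.out.println(fillStatic()); // 99
        System.out.println(fillStatic()); // 198: the static list keeps growing
    }
}
```

Repeated calls to fillLocal leave nothing behind, while repeated calls to fillStatic grow the retained set without bound unless something removes the entries.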


In Effective Java, Item 5 (Item 6 in the second edition, which the excerpt below follows):



Eliminate obsolete object references
When you switch from a language with manual memory management, such as C or C++, to a garbage-collected language, your job as a programmer is made much easier by the fact that your objects are automatically reclaimed when you're through with them. It seems almost like magic when you first experience it. It can easily lead to the impression that you don't have to think about memory management, but this isn't quite true.
Consider the following simple stack implementation:
// Can you spot the "memory leak"?
public class Stack {
    private Object[] elements;
    private int size = 0;
    private static final int DEFAULT_INITIAL_CAPACITY = 16;

    public Stack() {
        elements = new Object[DEFAULT_INITIAL_CAPACITY];
    }

    public void push(Object e) {
        ensureCapacity();
        elements[size++] = e;
    }

    public Object pop() {
        if (size == 0)
            throw new EmptyStackException();
        return elements[--size];
    }

    /**
     * Ensure space for at least one more element, roughly
     * doubling the capacity each time the array needs to grow.
     */
    private void ensureCapacity() {
        if (elements.length == size)
            elements = Arrays.copyOf(elements, 2 * size + 1);
    }
}
There's nothing obviously wrong with this program (but see Item 26 for a generic version). You could test it exhaustively, and it would pass every test with flying colors, but there's a problem lurking. Loosely speaking, the program has a "memory leak," which can silently manifest itself as reduced performance due to increased garbage collector activity or increased memory footprint. In extreme cases, such memory leaks can cause disk paging and even program failure with an OutOfMemoryError, but such failures are relatively rare.
So where is the memory leak? If a stack grows and then shrinks, the objects that were popped off the stack will not be garbage collected, even if the program using the stack has no more references to them. This is because the stack maintains obsolete references to these objects. An obsolete reference is simply a reference that will never be dereferenced again. In this case, any references outside of the "active portion" of the element array are obsolete. The active portion consists of the elements whose index is less than size.
Memory leaks in garbage-collected languages (more properly known as unintentional object retentions) are insidious. If an object reference is unintentionally retained, not only is that object excluded from garbage collection, but so too are any objects referenced by that object, and so on. Even if only a few object references are unintentionally retained, many, many objects may be prevented from being garbage collected, with potentially large effects on performance.
The fix for this sort of problem is simple: null out references once they become obsolete. In the case of our Stack class, the reference to an item becomes obsolete as soon as it's popped off the stack. The corrected version of the pop method looks like this:
public Object pop() {
    if (size == 0)
        throw new EmptyStackException();
    Object result = elements[--size];
    elements[size] = null; // Eliminate obsolete reference
    return result;
}
An added benefit of nulling out obsolete references is that, if they are subsequently dereferenced by mistake, the program will immediately fail with a NullPointerException, rather than quietly doing the wrong thing. It is always beneficial to detect programming errors as quickly as possible.
When programmers are first stung by this problem, they may overcompensate by nulling out every object reference as soon as the program is finished using it. This is neither necessary nor desirable, as it clutters up the program unnecessarily. Nulling out object references should be the exception rather than the norm. The best way to eliminate an obsolete reference is to let the variable that contained the reference fall out of scope. This occurs naturally if you define each variable in the narrowest possible scope (Item 45).
So when should you null out a reference? What aspect of the Stack class makes it susceptible to memory leaks? Simply put, it manages its own memory. The storage pool consists of the elements of the elements array (the object reference cells, not the objects themselves). The elements in the active portion of the array (as defined earlier) are allocated, and those in the remainder of the array are free. The garbage collector has no way of knowing this; to the garbage collector, all of the object references in the elements array are equally valid. Only the programmer knows that the inactive portion of the array is unimportant. The programmer effectively communicates this fact to the garbage collector by manually nulling out array elements as soon as they become part of the inactive portion.
Generally speaking, whenever a class manages its own memory, the programmer should be alert for memory leaks. Whenever an element is freed, any object references contained in the element should be nulled out.
Another common source of memory leaks is caches. Once you put an object reference into a cache, it's easy to forget that it's there and leave it in the cache long after it becomes irrelevant. There are several solutions to this problem. If you're lucky enough to implement a cache for which an entry is relevant exactly so long as there are references to its key outside of the cache, represent the cache as a WeakHashMap; entries will be removed automatically after they become obsolete. Remember that WeakHashMap is useful only if the desired lifetime of cache entries is determined by external references to the key, not the value.
More commonly, the useful lifetime of a cache entry is less well defined, with entries becoming less valuable over time. Under these circumstances, the cache should occasionally be cleansed of entries that have fallen into disuse. This can be done by a background thread (perhaps a Timer or ScheduledThreadPoolExecutor) or as a side effect of adding new entries to the cache. The LinkedHashMap class facilitates the latter approach with its removeEldestEntry method. For more sophisticated caches, you may need to use java.lang.ref directly.
A third common source of memory leaks is listeners and other callbacks. If you implement an API where clients register callbacks but don't deregister them explicitly, they will accumulate unless you take some action. The best way to ensure that callbacks are garbage collected promptly is to store only weak references to them, for instance, by storing them only as keys in a WeakHashMap.
Because memory leaks typically do not manifest themselves as obvious failures, they may remain present in a system for years. They are typically discovered only as a result of careful code inspection or with the aid of a debugging tool known as a heap profiler. Therefore, it is very desirable to learn to anticipate problems like this before they occur and prevent them from happening.
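The excerpt mentions that LinkedHashMap can evict stale cache entries as a side effect of adding new ones, through its removeEldestEntry method. A minimal sketch of that idea follows; the class name BoundedCache and the size limit are assumptions chosen for illustration, not something the book prescribes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded cache built on LinkedHashMap.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        // accessOrder = true: the least recently accessed entry is the eldest
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put() after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a", so "b" becomes the eldest entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

Because the map never holds more than maxEntries entries, the cache cannot retain an unbounded number of obsolete references, whatever the callers forget to remove.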


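My understanding is that this is not a memory leak in the true sense, but a consequence of the JVM GC being uncontrollable and nondeterministic. Everything in Java is a handle and memory is managed, so we cannot lose track of memory outright. The mistake we can make is to keep holding on to a handle, so that the memory behind it cannot be reclaimed.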


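This mainly concerns long-lived objects. For example, if the array in Effective Java Item 5 is left untouched for a long time after a pop, the popped object will not be reclaimed until its slot is overwritten, which produces the so-called "memory leak".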
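The slot behaviour described here can be observed directly. The sketch below condenses the quoted Stack into a hypothetical class, StackSlotDemo, and adds an inspection helper (slotHoldsReference) that exists only for this illustration.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

public class StackSlotDemo {
    private Object[] elements = new Object[16];
    private int size = 0;

    public void push(Object e) {
        if (elements.length == size) {
            elements = Arrays.copyOf(elements, 2 * size + 1);
        }
        elements[size++] = e;
    }

    // Same logic as the first pop() quoted above: the slot keeps its reference.
    public Object leakyPop() {
        if (size == 0) throw new EmptyStackException();
        return elements[--size];
    }

    // Same logic as the corrected pop(): the slot is cleared.
    public Object cleanPop() {
        if (size == 0) throw new EmptyStackException();
        Object result = elements[--size];
        elements[size] = null;
        return result;
    }

    // Hypothetical helper, for illustration only: does the slot still hold a reference?
    public boolean slotHoldsReference(int index) {
        return elements[index] != null;
    }

    public static void main(String[] args) {
        StackSlotDemo stack = new StackSlotDemo();
        stack.push(new Object());
        stack.leakyPop();
        System.out.println(stack.slotHoldsReference(0)); // true: still reachable via the array
        stack.push(new Object()); // overwrites slot 0, releasing the first object
        stack.cleanPop();
        System.out.println(stack.slotHoldsReference(0)); // false: nothing retained
    }
}
```

After leakyPop the object stays reachable through the array until a later push overwrites the slot, which on a long-lived, rarely used stack may never happen.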


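Of course, when this object dies, the objects it references are eventually released. But with thousands of users accessing the system, a collection that keeps holding references to objects no longer actually in use will eventually cause an OutOfMemory error. That is the essence of the "memory leak" here.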


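When writing programs, prefer small, short-lived objects. Do not let long-lived objects hold on to references to other objects without releasing them.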


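Of course, if the field in the example above is static, the problem is even more serious, because once the class is loaded, the static field stays alive from then on.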


To quote CCF member 太阳公公:



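As I see it, Java memory leaks mainly arise in three places:
1. Static variables: this one is simple, so I will not elaborate.
2. Resources: anywhere that accesses databases, networks, messaging, files, and similar resources. For example, I once hit an Oracle JDBC bug where a query finished but the connection was not released, so none of the associated resources were released either.
3. Deadlock: for example, when threads deadlock, the resources they hold are never released.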
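Normally, when writing J2EE Java programs, you do not need to pay special attention to memory leaks.
Of course, other situations can also make the JVM run out of memory, and they are not necessarily leaks.
For example, I once parsed an XML document larger than 50 MB in a highly concurrent Java program, and it ran out of memory even though the server had plenty of RAM. My understanding is that the JVM could not obtain a large amount of memory quickly; configuring the JVM startup parameters solved it.
Also note that during a long-running operation, the memory that operation is using is not released.
For example, when generating a file after a series of complex queries and statistics, if the process takes a long time, a GC that runs in the meantime will not free that memory.
Beyond that, every JVM has many startup parameters worth studying, and those of the Sun JVM, BEA JRockit, and IBM JVM differ somewhat.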
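For the resource case in the list above, one common safeguard is to release the resource in a way that cannot be skipped when an exception occurs. The sketch below uses try-with-resources (Java 7 and later); TrackedResource is a hypothetical stand-in for a connection or file handle, not a real JDBC class.

```java
public class ResourceReleaseDemo {
    // Hypothetical stand-in for a connection, socket, or file handle.
    public static final class TrackedResource implements AutoCloseable {
        private boolean closed = false;

        public void use(boolean fail) {
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        }

        @Override
        public void close() {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    // Returns whether the resource ended up closed, with or without a failure.
    public static boolean runAndReportClosed(boolean fail) {
        TrackedResource tracked = new TrackedResource();
        try (TrackedResource res = tracked) {
            res.use(fail);
        } catch (IllegalStateException e) {
            // Swallowed for the demo; close() has already run at this point.
        }
        return tracked.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(runAndReportClosed(false)); // true
        System.out.println(runAndReportClosed(true));  // true
    }
}
```

On older Java versions the same effect is obtained with a finally block that calls close(). Neither form helps if the driver itself fails to release what it holds, as in the JDBC bug mentioned above.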


Besides releasing object references promptly, weak reference objects can also help address memory leaks. Two IBM developerWorks articles discuss this:

http://www.ibm.com/developerworks/cn/java/j-refs/index.html

http://www.ibm.com/developerworks/cn/java/j-jtp11225/

Based on these two articles, I wrote a small piece of test code:


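import java.lang.ref.WeakReference;

public class WeakTest {
    public static void main(String[] args) {
        // new String(...) creates a fresh heap object; a plain literal would stay
        // strongly reachable through the string pool.
        String s = new String("Hello");
        WeakReference<String> wr = new WeakReference<String>(s);
        s = null; // 1: drop the only strong reference
        System.gc(); // only a request; the JVM may ignore it
        System.out.println(wr.get());
    }
}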


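Here s is a strong reference and wr is a weak reference. When the statement marked 1 is present, the output is null, which shows that the weakly referenced object was reclaimed by the gc. (System.gc() is only a request, so this is the typical outcome rather than a guaranteed one.) Without statement 1, s still holds a strong reference and the program prints Hello.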

Java also provides a ready-made collection class built on weak references: WeakHashMap.
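A minimal sketch of how WeakHashMap behaves is shown below. The class and method names are hypothetical. Only the keys are held weakly, and whether an entry has disappeared after System.gc() depends on the JVM, so the second part of the demo makes no promise about its output.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheDemo {
    // While the caller still holds a strong reference to the key, the entry stays.
    public static boolean entrySurvivesWhileKeyHeld() {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "metadata");
        System.gc();
        // key is used below, so it was strongly reachable during the gc request
        return "metadata".equals(cache.get(key));
    }

    public static void main(String[] args) {
        System.out.println(entrySurvivesWhileKeyHeld()); // true

        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "metadata");
        key = null;  // drop the only strong reference to the key
        System.gc(); // only a request; the JVM may ignore it
        // Usually prints 0 once the key has been collected, but this is not guaranteed.
        System.out.println(cache.size());
    }
}
```

Keys such as string literals or cached boxed values would defeat the purpose, since other strong references to them exist outside the map.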
