Information redundancy becomes a crucial problem on the Web when content from different resources is automatically combined to produce a new WWW publication. Information retrieval, natural language processing and the latest WWW activities offer a challenging framework for approaching the information redundancy problem of automatically combined news articles. It seems reasonable that minimising information redundancy should be performed by a hybrid technique that combines elements of these approaches. The purpose of this exploratory study is to introduce a theoretical and practical framework for clarifying the information redundancy problem in the case of integrated publishing.