Very long instruction word- (VLIW-) based processors have become widely adopted as a basic building block in modern System-on-Chip designs. Advances in clustered VLIW architectures have extended the scalability of the VLIW architecture paradigm to a large number of functional units and very-wide-issue widths. A central challenge with wide-issue clustered VLIW architecture is the availability of programming and automated compiler methods that can fully utilize the available computational resources. Existing compilation approaches for clustered-VLIW architectures are based on extensions of previously developed scheduling algorithms that primarily focus on the maximization of instruction-level parallelism (ILP). However, many applications do not have sufficient ILP to fully utilize a large number of functional units. On the other hand, many applications in digital communications and multimedia processing exhibit enormous amounts of data-level parallelism (DLP). For these applications, the streaming programming paradigm has been developed to explicitly expose coarse-grained data-level parallelism as well as the locality of communication between coarse-grained computation kernels. In this paper, we investigate the mapping of stream programs to wide-issue clustered VLIW processors. Our work enables designers to leverage their existing investments in VLIW-based architecture platforms to harness the advantages of the stream programming paradigm.