How to Remove Duplicate Items from an Array in Perl

Working with arrays in programming often leads you to a common challenge: how to handle duplicate values. If you’re a Perl programmer facing this problem, you might be unsure of how to effectively remove duplicate items from an array.

In this blog post, we’ll walk through two ways to eliminate duplicates from an array in Perl: a short custom function, and the uniq function from the core List::Util module that ships with newer versions of Perl.

Understanding the Problem

Consider an example where you have the following array in Perl:

my @my_array = ("one", "two", "three", "two", "three");

In this array, the values “two” and “three” appear more than once, making it a candidate for de-duplication. The goal here is to transform this array into a unique list, essentially getting rid of these duplicates, so you end up with:

one two three

Solution: Custom Approach

Creating a uniq Function

A simple and effective way to remove duplicates is by creating a custom function. Below is a function named uniq that you can use:

sub uniq {
    my %seen;                 # counts how often each item has appeared so far
    grep !$seen{$_}++, @_;    # keep an item only the first time it appears
}

Breaking Down the Function

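  1. %seen: This is a hash that stores each element as a key, with a count of how many times it has appeared as the value. Hash keys in Perl are unique, which makes a hash a natural fit for tracking duplicates.

  2. grep: This function iterates over the list and keeps the items for which the condition is true. Here, $seen{$_}++ returns the current count for the item ($_) and then increments it. The first time an item appears, that count is undefined (false), so !$seen{$_}++ is true and grep keeps the item. On every later appearance the count is already 1 or more, so the condition is false and the item is dropped. Because grep walks the list in order, the result keeps the first occurrence of each item in its original position.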
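
To make the one-liner easier to follow, here is a sketch of the same logic written as an explicit loop. The name uniq_verbose and its variable names are made up for illustration; it is meant to behave like uniq above, not to replace it.

```perl
use strict;
use warnings;

# Hypothetical expanded form of uniq, for illustration only.
sub uniq_verbose {
    my %seen;      # item => number of times encountered so far
    my @result;    # first occurrences, in their original order

    for my $item (@_) {
        # Remember whether this item was counted before, then count it.
        my $already_seen = $seen{$item};
        $seen{$item}++;

        # Keep the item only on its first appearance.
        push @result, $item unless $already_seen;
    }

    return @result;
}

my @unique = uniq_verbose(qw(one two three two three));
print "@unique\n";    # prints: one two three
```

The grep version does the "look up, then count" step in a single expression, because the post-increment operator returns the old value before adding one.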

Applying the Function

You can apply the function to your array as follows:

my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n"; # This prints: one two three

Testing the Output

After running this code, your output will display the filtered array:

one two three

The duplicates have been removed, and the remaining items keep the order in which they first appeared.

Using Built-in Functions

If you’re using Perl version 5.26.0 or later, the core List::Util module already provides a uniq function, so you don’t need to write your own:

List::Util Module

The uniq function from the List::Util module handles duplicates efficiently. To use it:

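  1. Make sure your Perl is version 5.26.0 or later, or that your installed List::Util is recent enough to provide uniq.
  2. On an older Perl, install a newer List::Util from CPAN (it is distributed as Scalar-List-Utils).
  3. Import uniq and call it directly on your array.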

Example:

use List::Util 'uniq';
my @array = qw(one two three two three);
my @filtered = uniq(@array);

print "@filtered\n"; # This prints: one two three
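
If you are not sure whether the installed List::Util is new enough, one option is to ask for a minimum version when importing. The sketch below assumes 1.45 as the minimum, since the List::Util documentation lists uniq as available since that version; adjust the number if your needs differ.

```perl
use strict;
use warnings;

# Requesting a minimum version makes Perl stop at compile time
# if the installed List::Util is older than 1.45 (assumed minimum).
use List::Util 1.45 qw(uniq);

# Show which version is actually installed.
print "List::Util version: $List::Util::VERSION\n";

my @filtered = uniq(qw(one two three two three));
print "@filtered\n";    # prints: one two three
```

With this in place, an outdated installation fails early with a clear version error rather than a less obvious complaint that uniq is not exported.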

What Makes Built-in Uniq Better?

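  • Efficiency: List::Util is normally compiled C code (XS) rather than pure Perl, so it is typically faster than the grep version on large lists.
  • Handling undefined values: List::Util::uniq treats undef as a value distinct from the empty string. The custom function stores both under the same hash key, so it treats them as duplicates of each other.
  • No warnings: List::Util::uniq does not warn when the list contains undef, whereas the custom function triggers a "Use of uninitialized value" warning under use warnings.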
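
The difference in undef handling can be seen with a small experiment. This is a sketch: the name my_uniq is just the hand-written function from earlier, renamed so it does not clash with the imported uniq, and the warning counter exists only for this demonstration.

```perl
use strict;
use warnings;
use List::Util 1.45 qw(uniq);

# The hand-written version, renamed (hypothetical name) to avoid a clash.
sub my_uniq {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

my @input = ('a', undef, '', 'a', undef);

# Collect warnings in an array instead of printing them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my @custom          = my_uniq(@input);   # undef and '' share the hash key ''
my $custom_warnings = @warnings;

@warnings = ();
my @builtin          = uniq(@input);     # undef and '' stay distinct
my $builtin_warnings = @warnings;

printf "custom:   %d items, warnings: %s\n",
    scalar @custom, $custom_warnings ? 'yes' : 'no';
printf "built-in: %d items, warnings: %s\n",
    scalar @builtin, $builtin_warnings ? 'yes' : 'no';
```

Here the custom function keeps two items ('a' and undef) and warns about the undefined value, while List::Util::uniq keeps three ('a', undef and the empty string) without warning.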

Conclusion

Removing duplicate items from an array in Perl can be done effectively through a custom function or by leveraging built-in capabilities, especially with List::Util. Whichever method you choose, you can ensure that cleanup of your arrays is straightforward and efficient. Now you can tackle arrays with confidence!

Practice these techniques in your Perl projects to refine your skills!