> For the complete documentation index, see [llms.txt](https://cs61b-2.gitbook.io/cs61b-textbook/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://cs61b-2.gitbook.io/cs61b-textbook/20.-hashing-ii/20.3-contains-and-duplicate-items.md).

# 20.3 Contains & Duplicate Items

## Contains

{% embed url="<https://www.youtube.com/watch?index=4&list=PL8FaHk7qbOD637Q-6p7nn5dKz1tK6WAjJ&v=O6UlxmISOx4>" %}

### The `equals()` Method for a ColoredNumber Object

Suppose the `equals()` method for ColoredNumber is as below, i.e. two ColoredNumbers are equal if they have the same num.

```java
@Override
public boolean equals(Object o) {
   if (o instanceof ColoredNumber otherCn) {
       return this.num == otherCn.num;
   }
   return false;
}
```

### HashSet Behavior for Checking `contains()`

Suppose the `equals()` method for ColoredNumber is on the previous slide, i.e. two ColoredNumbers are equal if they have the same num.&#x20;

```java
int N = 20;
HashSet<ColoredNumber> hs = new HashSet<>();
for (int i = 0; i < N; i += 1) {
    hs.add(new ColoredNumber(i));
}
```

Suppose we now check whether 12 is in the hash table.

```java
ColoredNumber twelve = new ColoredNumber(12);
hs.contains(twelve); //returns true
```

{% hint style="info" %}
What do we **expect** to be returned by `contains`?
{% endhint %}

<details>

<summary>Answer</summary>

We expect the `contains` call to be true, all `12`s are created equal!

</details>

### Finding an Item Using the Default Hashcode

Suppose we are using the default hash function (uses memory address):

```java
int N = 20;
HashSet<ColoredNumber> hs = new HashSet<>();
for (int i = 0; i < N; i += 1) {
   hs.add(new ColoredNumber(i));
}
ColoredNumber twelve = new ColoredNumber(12);
hs.contains(twelve); // returns ??
```

which yields the table below:

![](https://lh4.googleusercontent.com/cFvLVhOYg31lSg1a8moftrr30qpThw3Bc7drJWVLNrSTCgTdO4yxjis1epmRlRMLWIkh73alL6OrblQxqGGrjMo1XMoOfPjSFH3tPDDbaxSbLXe1-HXJcfQOglsU4Dp74PiDDqMWXNOmYeQu_D7-l_B2SBcdTVohYMYLZdnzzMVn8_hAbvSKDAKyC-_ev3nN=s2048)

Suppose equals returns true if two ColoredNumbers have the same num (as we've defined previously).

{% hint style="info" %}
What is actually returned by `contains`?
{% endhint %}

<details>

<summary>Answer</summary>

Returns false with probability 5/6ths.

Default `hashCode()` is based on memory address. equals is based on `num`.

There are two ColoredNumber objects with `num = 12`. One of them is in the HashSet and one of them was created by the code above.

Each memory address is random, with only a 1/6th chance they modulo to the same bucket.

Example: If the `ColoredNumber` object `twelve` created by the code above is in memory location 6000000, its hashCode % 6 is 0. HashSet looks in bucket zero, and doesn't find 12.

</details>

{% hint style="info" %}
Hard Question: If the default hash code achieves a good spread, why do we even bother to create custom hash functions?&#x20;
{% endhint %}

<details>

<summary>Answer</summary>

It is necessary to have consistency between `equals()` and `hashCode()` for the hash table's operations to function.

</details>

**Basic rule (also definition of deterministic property of a valid hashcode):** If two objects are equal, they **must** have the same hash code so the hash table can find it.

## Duplicate Values

{% embed url="<https://www.youtube.com/watch?index=5&list=PL8FaHk7qbOD637Q-6p7nn5dKz1tK6WAjJ&v=3HmCkRYsAGg>" %}

### Overriding `equals()` but Not `hashCode()`

Suppose we have the same `equals()` method (comparing `num`), but we do not override `hashCode()`.

```java
public boolean equals(Object o) {
   ...  return this.num == otherCn.num; ...
}
```

The result of adding 0 through 19 is shown below:

![](/files/gpIvSgJCkMl7Di5JPvsR)

```java
ColoredNumber zero = new ColoredNumber(0);
hs.add(zero); // does another zero appear?
```

{% hint style="info" %}
Which can happen when we call add(zero)?

Answer Choices:

1. We get a 0 to bin zero.
2. We add another 0 to bin one.
3. We add a 0 to some other bin.
4. We do not get a duplicate zero
   {% endhint %}

<details>

<summary>Answer </summary>

3 Choices are Correct:&#x20;

\#1, #3, #4.

We get a 0 to bin zero, We add a 0 to some other bin, and we do not get a duplicate zero.

The new zero ends up in a random bin.

* 5/6ths chance: In bin 0, 2, 3, 4, or 5. Duplicate!
* 1/6 chance: In bin 1, no duplicate! (`equals()` blocks it)

</details>

## Key Takeaway: `equals()` and `hashCode()`

Bottom line: If your class override equals, you should also override hashCode in a consistent manner.

* If two objects are equal, they must always have the same hash code.<br>

If you don’t, everything breaks:

* `Contains` can’t find objects (unless it gets lucky).
* `Add` results in duplicates.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://cs61b-2.gitbook.io/cs61b-textbook/20.-hashing-ii/20.3-contains-and-duplicate-items.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
