A cap container includes a tank, a conical protrusion, and a stepped coupling. The tank includes an upper rim, a bottom opposite the upper rim, and a hole formed through the bottom. The conical protrusion includes a lower end, an upper end, an inner wall extending from the lower end to the upper end and defining a channel, and an engagement part formed on the inner wall near the lower end. The channel passes through the lower end and the upper end, communicates with the hole of the tank, and can receive a straw. The stepped coupling has at least one annular groove and is formed at the bottom of the tank. The cap container can be filled with solid food and attached, by the engagement part of the conical protrusion, to the top of a commercially available drink container, such as a bottle or a cup filled with liquid. A user can hold the bottle or cup combined with the cap container in one hand while the other hand remains free to swing or to take the solid food from the tank, and the user can at the same time drink the liquid from the bottle or cup through the straw.