There are a couple of different answers to your question.
The first answer is that the Implicit Function Theorem allows us to treat an implicitly defined relation as if it were an explicit function locally: if we can "zoom in" enough around a point that the curve looks like the graph of a function, then within that neighborhood we may differentiate it as one. (Precisely, for a continuously differentiable relation $F(x, y) = 0$ this works near any point where $\frac{\partial F}{\partial y} \neq 0$.)
The reason for this is that, within such a neighborhood, the relation can be turned into an explicit function by adding a parameter which tells you which particular branch you are looking at. This parameter is constant within the neighborhood, and therefore doesn't affect differentiation.
Imagine $y = \pm\sqrt{x}$. Here $y$ is not a true function of $x$, because each $x > 0$ gives two values of $y$. However, $y$ acts like a function once you zoom in far enough on one branch. Therefore, we can imagine a parameter $q$ attached to the function, which distinguishes which path to take. So, instead of $y = f(x)$, we actually have $y = f(x, q)$. We can define it like this:
$$ f(x, q) = \begin{cases} +\sqrt{x}, & \text{if } q = 0 \\ -\sqrt{x}, & \text{if } q = 1 \end{cases} $$
For each fixed $q$, this is a differentiable function everywhere except at the crossover point (here $x = 0$, where the two branches meet). Additionally, we can take $q$ to be constant everywhere except at the crossover point, so it doesn't alter the derivative (except, again, at the crossover point).
Therefore, implicit functions can be treated as full functions so long as they look like functions when sufficiently zoomed in.
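As a quick sanity check, here is a short worked comparison. It assumes the relation behind $y = \pm\sqrt{x}$ is written as $y^2 = x$ (my rewriting, not something stated above) and that $x > 0$, so we stay away from the crossover point.

$$
\begin{aligned}
q = 0: &\quad \frac{\partial f}{\partial x} = +\frac{1}{2\sqrt{x}} \\
q = 1: &\quad \frac{\partial f}{\partial x} = -\frac{1}{2\sqrt{x}} \\
y^2 = x: &\quad 2y\,\frac{dy}{dx} = 1 \;\Rightarrow\; \frac{dy}{dx} = \frac{1}{2y}
\end{aligned}
$$

Substituting $y = +\sqrt{x}$ or $y = -\sqrt{x}$ into $\frac{1}{2y}$ recovers the two branch derivatives, so differentiating the relation implicitly agrees with treating each branch as an explicit function.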
The second answer is that one of the reasons you are probably tripping up is that you were taught to go straight to full derivatives, always "with respect to" some variable. Personally, I find that this confuses things more than it helps. I find it more helpful to always focus on differentials first, derivatives second.
Example: The derivative of $x^2$ is $2x$, but the differential of $x^2$ is $2x\,dx$. Because a differential isn't taken with respect to any particular variable, the process doesn't have to care which variable you will eventually differentiate with respect to. Additionally, every derivative rule converts directly into a differential rule. This seems like a minor difference until you get to more complicated formulas.
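For reference, here is roughly what that conversion looks like for the most common rules. The symbols $u$, $v$ (arbitrary expressions), $n$ and $c$ (constants) are my own notation for this sketch:

$$
\begin{aligned}
d(c) &= 0 \\
d(u + v) &= du + dv \\
d(uv) &= u\,dv + v\,du \\
d(u^n) &= n u^{n-1}\,du
\end{aligned}
$$

Notice that none of these rules mentions an independent variable.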
For instance, if I have $x^2 + z^3 + xy = 5$, I can differentiate this into $2x\,dx + 3z^2\,dz + x\,dy + y\,dx = 0$, which is just a direct application of the differential rules. Now that I have the differential, finding any derivative is merely a matter of solving for it. If I want to find $\frac{dy}{dx}$, I can isolate $dy$ and divide through by $x\,dx$ (assuming $x \neq 0$):
$$
\begin{aligned}
2x\,dx + 3z^2\,dz + x\,dy + y\,dx &= 0 \\
x\,dy &= -2x\,dx - 3z^2\,dz - y\,dx \\
\frac{dy}{dx} &= -2 - \frac{3z^2}{x}\frac{dz}{dx} - \frac{y}{x}
\end{aligned}
$$
The way to read this equation is to say that the slope between $y$ and $x$ depends not only on the particular position on the graph, but also on the slope being used between $z$ and $x$, that is, on $\frac{dz}{dx}$.
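If you want to convince yourself that the formula for $\frac{dy}{dx}$ holds, a numerical check is easy to sketch. The path $z(x) = \sin x$ below is purely a hypothetical choice of mine (any differentiable path would do), and the function names are made up for illustration. The script solves the original relation for $y$ and compares a finite-difference slope against the formula.

```python
import math

# Hypothetical path for z (an assumption for this check, not from the answer).
def z_of_x(x):
    return math.sin(x)

def dz_dx(x):
    return math.cos(x)

def y_of_x(x):
    # Solve x^2 + z^3 + x*y = 5 for y (requires x != 0).
    z = z_of_x(x)
    return (5 - x**2 - z**3) / x

def dy_dx_formula(x):
    # dy/dx = -2 - (3 z^2 / x) * dz/dx - y/x, from the differential.
    z = z_of_x(x)
    y = y_of_x(x)
    return -2 - (3 * z**2 / x) * dz_dx(x) - y / x

def dy_dx_numeric(x, h=1e-6):
    # Central finite difference along the chosen path.
    return (y_of_x(x + h) - y_of_x(x - h)) / (2 * h)

x0 = 1.5
print(dy_dx_formula(x0), dy_dx_numeric(x0))
```

The two printed numbers should agree to several decimal places. Swapping in a different path for $z$ changes both numbers together, which is exactly the dependence on $\frac{dz}{dx}$ described above.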
I prefer this process because it unifies the differentiation process between explicit derivatives, implicit derivatives, and multivariable total derivatives.
Additionally, you can move from this to partial derivatives just by setting all of the non-participating differentials to 0. So to find $\frac{\partial y}{\partial x}$ you just set $dz = 0$, which gives:
$$ \frac{\partial y}{\partial x} = -2 - \frac{y}{x} $$
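The partial derivative can be checked numerically in the same spirit. In this sketch $z$ is frozen at a hypothetical value $z_0 = 0.7$ (my choice, as are the function names), which is the numerical counterpart of setting $dz = 0$.

```python
def y_at(x, z):
    # Solve x^2 + z^3 + x*y = 5 for y (requires x != 0).
    return (5 - x**2 - z**3) / x

def partial_formula(x, z):
    # dy/dx with dz = 0: -2 - y/x.
    return -2 - y_at(x, z) / x

def partial_numeric(x, z, h=1e-6):
    # Central finite difference in x with z held fixed.
    return (y_at(x + h, z) - y_at(x - h, z)) / (2 * h)

x0, z0 = 1.5, 0.7  # hypothetical sample point
print(partial_formula(x0, z0), partial_numeric(x0, z0))
```

Again the two values should match closely, since holding $z$ fixed removes the $\frac{dz}{dx}$ term.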