What is the best way to set a class value to a variable in Python if it exists in a dictionary?

Question

I have this code which sets values in a location class from values from JSON data.

class Location:
    pass
    
def create_location_class_from_json(json_data):
    location_obj = Location()
    name_str = "name"
    if name_str in json_data:
        location_obj.name = json_data["name"]
    address_str = "address"
    if address_str in json_data:
        address = json_data[address_str]
        street1_str = "street1"
        if street1_str in address:
            location_obj.street = address[street1_str]
        city_str = "city"
        if city_str in address:
            location_obj.city = address["city"]
        postal_code_str = "postalCode"
        if postal_code_str in address:
            location_obj.postal_code = address[postal_code_str]
        lat_str = "latitude"
        long_str = "longitude"
        if lat_str in address and long_str in address:
            location_obj.latlong = (address[lat_str],address[long_str])
        remote_str = "remote"
    if remote_str in json_data:
        location_obj.remote = json_data[remote_str]
    return location_obj
            
if __name__ == "__main__":
    json_data = {
        "name": "Park",
        "address": {
            "street1": "Street",
            "city": "City",
            "state": "ST",
            "countryCode": "US",
            "postalCode": "12345"
        },
        "latitude": "0",
        "longitude": "0",
        "remote": False,
    }
    location_obj = create_location_class_from_json(json_data)
    print(location_obj.name)

How can I improve the create_location_class_from_json() function in this code? I feel that that there is a better way to set the values in the location object than the method I used in that function.

J_H · Accepted Answer · 2024-06-27 16:37:06Z

tl;dr: It should be a @dataclass.

code which sets values in a location class from values from JSON data.

The vocabulary word you're looking for is "object attribute". The code sets attributes of a location object. Using such terminology will prove helpful when you're googling.

BTW, one can set a class attribute, offering global variable or singleton behavior, scoped to consumers of the class. But that's not something the OP code does.

naming

def create_location_class_from_json(json_data):

No, the class Location: statement is what creates a Location class. This function is creating a location object, an instance of that class. Simply name it create_location_from_json, please.

    location_obj = Location()

Just call it a location, please.

Suppose you opened a novel and started reading this totally normal dialog:

"Hello, Alice person," said Bob. "I am Bob person, pleased to meet you."

It's clear from context that there's some people in this situation; we don't need to call it out explicitly. So, too, with program identifiers.

optional entries

This code is smart enough to anticipate that an entry might be missing, and deal with it.

    if name_str in json_data:
        location_obj.name = json_data["name"]

That json_data["name"] dereference would blow up with KeyError if the name was missing.

But dict offers a more convenient way to cope with this common concern:

    location_obj.name = json_data.get("name")

This is a little different from the OP code's behavior, as it will assign .name = None in the missing case, but that's actually desirable -- see the ToC section.

hardcode

I feel your foo_str constants are perhaps a little over the top, given the strictly function local context, but let's go with it. You apparently meant to type "name" exactly once. So that would give us

    location_obj.name = json_data.get(name_str)

It's polite for a python class to tell the Gentle Reader about all its possible instance attributes, in one place. It serves as a heads up, here's what to expect as you're reading the rest of the code. So we might have a def __init__(self): constructor which assigns self.name = None, plus several other attributes.

But here, there's essentially no behavior to the class, it's just a bunch of storage locations. That is the perfect fit for a dataclass, so let's use that decorator, annotating types as we go.

from dataclasses import dataclass

@dataclass
class Location:
    name: str
    street: str
    city: str
    state: str
    postal_code: str
    lat_long: tuple[float, float]
    remote: bool

I renamed self.lat_long to match convention, and also changed its type to something more useful than strings. (Note that ZIP must remain str, due to e.g. Boston zipcodes like 02108.)

You could break out a separate class Address: if you want to mirror the nested JSON structure. I'm going with what you wrote, which is perfectly ordinary and fine.

Also, I understand why you called it json_data, and that's not such a terrible name, you could keep it. But simply data would be better, since it contains the result of a JSON parse. Or perhaps location_data.

creating an instance object

    location_obj = create_location_class_from_json(json_data)

I started out thinking that this should be as simple as

    location_obj = Location(**json_data)

However there are a few details to attend to:

extra fields, such as "countryCode"
combining the lat + long fields
renaming fields like "street1" -> "street"
renaming fields from camelCase to snake_case
nesting of fields like "address"

None of these are hard, merely tedious. It's a matter of modeling how your desired result differs from the input data that arrived in JSON form. Here is complete code for one take on that.

from dataclasses import dataclass
from pprint import pp


@dataclass
class Location:
    name: str
    street: str
    city: str
    state: str
    postal_code: str
    lat_long: tuple[float, float]
    remote: bool


if __name__ == "__main__":
    json_data = {
        "name": "Park",
        "address": {
            "street": "Street",
            "city": "City",
            "state": "ST",
            "countryCode": "US",
            "postal_code": "12345",
        },
        "latitude": "0",
        "longitude": "0",
        "remote": False,
    }
    d = json_data.copy()
    d.update(**d.pop("address"))
    del d["countryCode"]
    d["lat_long"] = (float(d.pop("latitude")), float(d.pop("longitude")))

    location_obj = Location(**d)
    print(location_obj.name, "\n")
    print(location_obj, "\n")
    pp(location_obj)

output:

Park 

Location(name='Park', street='Street', city='City', state='ST', postal_code='12345', lat_long=(0.0, 0.0), remote=False) 

Location(name='Park',
         street='Street',
         city='City',
         state='ST',
         postal_code='12345',
         lat_long=(0.0, 0.0),
         remote=False)

camelCase translation

Doing "$ pip install dataclass-wizard" can help with that need to cheat on the "postalCode" rename. However it bumps up against the other mismatches mentioned above, so I'll omit an example here.

optional entries, revisited

Suppose the incoming JSON data sometimes omits any mention of remote status. The class definition could cope with that in this way.

    ...
    remote: bool | None = None

Now if we del d["remote"] we see output ending with

         ...
         lat_long=(0.0, 0.0),
         remote=None)

This still differs from the OP code's behavior, which wouldn't define a .remote attribute at all. As mentioned above, that's probably not a great design decision. Most consumers of the Location class will expect that all attributes mentioned in the constructor's ToC are available for dereferencing. I encourage you to design classes with that in mind.

main guard

Thank you for including this.

if __name__ == "__main__":

It allows other modules to safely import this one without side effects. Please keep up that good habit.

SUTerliakov · Accepted Answer · 2024-06-27 15:14:29Z

Correctness

remote_str is defined at wrong indentation level, it will fail with an UnboundLocalError for {"remote": False} input. If that input is invalid, please at least document it somewhere. Also you expect latitude and longitude in address field, but your JSON example shows them in the root - probably something is wrong there? So...

Testing

To prevent such problems, consider writing a small unittest or pytest test suite - something that we'll run to ensure code works properly.

The tests should not report the result. They should assert something and either pass or fail, giving the gentle reader a green checkmark on success. We should not inspect the output to verify its correctness.

Readability

Honestly, just repeating key names twice would be more readable, IMO, but your solution makes sense if these names are really going to be changed a lot. However, it would be beneficial to define the key names somewhere together, not scattering across the whole implementation.

However, there are ways to declutter that further. If you don't need legacy (3.7 and below) support, then walrus operator might greatly help here together with dict.get method.

Here's what your function would look like with only this change:

def create_location_class_from_json(json_data):
    location_obj = Location()

    if (name := json_data.get("name")) is not None:
        location_obj.name = name
    if (address := json_data.get("address")) is not None:
        if (street := address.get("street1")) is not None:
            location_obj.street = street
        if (city := address.get("city")) is not None:
            location_obj.city = city
        if (postal_code := address.get("postalCode")) is not None:
            location_obj.postal_code = postal_code
        if (
            (lat := address.get("latitude")) is not None
            and (lon := address.get("longitude")) is not None
        ):
            location_obj.latlong = (lat, lon)
    if (remote := json_data.get("remote")) is not None:
        location_obj.remote = remote

    return location_obj

Interface

You are using a Location object as a namespace, this is reasonable, but usually classes should follow some known interface. Your Location class should be in charge of defining what attributes should be set on it and read from it, and probably encapsulate setting altogether. If your later code relies on hasattr to check whether name is set for given Location, probably it also needs refactoring? Usually hasattr is a nuclear weapon for some special cases, not for daily usage of classes you fully control.

Your code contradicts your JSON example (latitude and longitude position). I'm adapting the structure from your example here and below.

Now, let's try to use dataclass: it will generate __init__, __repr__ and several other methods for you automatically. Missing attributes will default to None.

@dataclass(slots=True)
class Location:
    name: str | None = None
    street: str | None = None
    city: str | None = None
    postal_code: str | None = None
    # Why are these strings? Usually lat&long are numbers...
    latlong: tuple[str, str] | None = None
    # You may want to use True or False as default, omitting None here
    remote: bool | None = None


def create_location_class_from_json(json_data):
    address = json_data.get("address", {})
    # use `address.get` below if your code was right and example - wrong
    lat = json_data.get("latitude")
    lon = json_data.get("longitude")
    return Location(
        name=json_data.get("name"),
        street=address.get("street1"),
        city=address.get("city"),
        postal_code=address.get("postalCode"),
        latlong=(lat, lon) if None not in (lat, lon) else None,
        remote=json_data.get("remote"),
    )

Nice, huh? However, this function is now clearly an alternative constructor for the class, so we can put it inside Location as a classmethod:

@dataclass(slots=True)
class Location:
    ...  # As above

    @classmethod
    def from_json(cls, json_data):
        ...  # Implementation of `create_location_class_from_json` here

Now you can use Location.from_json(data) - this is quite self-explanatory.

Field types

It might be correct for your domain, but I won't expect latlong to be a tuple of two strings. float? Okay. Decimal? Probably also okay. But a string is not the type users will expect for the latitude field which is semantically a number.

Validation

Your code assumes that values are correct if given - e.g. no one will pass {"remote": "YES!"} to it. It might be fine for your case, but might be not.

Existing tools

If you can consider external solutions - perhaps pydantic model can fit much better here? It will give you great validation out of the box. With pydantic, your example could become something like this:

from pydantic import BaseModel, Field, model_validator


class NestedAddress(BaseModel):
    street: str | None = Field(default=None, alias="street1")
    city: str | None = None
    postal_code: str | None = Field(default=None, alias="postalCode")


class Location(BaseModel):
    name: str | None = None
    street: str | None = None
    city: str | None = None
    postal_code: str | None = None
    latlong: tuple[str, str] | None = None
    remote: bool | None = None

    @model_validator(mode="before")
    def flatten_address(cls, values):
        raw_address = values.get("address")
        if raw_address is None:
            return values
        address = NestedAddress.validate(raw_address)
        return {
            "street": address.street,
            "city": address.city,
            "postal_code": address.postal_code,
        } | values

    @model_validator(mode="before")
    def prepare_latlong(cls, values):
        lat = values.get("latitude")
        lon = values.get("longitude")
        if lat is None or lon is None:
            return values
        return {"latlong": (lat, lon)} | values

Now the call site can be like

import logging

from pydantic import ValidationError

if __name__ == "__main__":
    logging.basicConfig()

    json_data = {
        "name": "Park",
        "address": {
            "street1": "Street",
            "city": "City",
            "state": "ST",
            "countryCode": "US",
            "postalCode": "12345",
        },
        "latitude": "0",
        "longitude": "0",
        "remote": False,
    }

    try:
        location = Location.validate(json_data)
    except ValidationError:
        logging.exception("Invalid address")
    else:
        print(location)

    try:
        bad_location = Location.validate(json_data | {"name": 1})
    except ValidationError:
        logging.exception("Invalid address")
    else:
        print(bad_location)

    try:
        json_data['address']['postalCode'] = 1
        bad_location_2 = Location.validate(json_data)
    except ValidationError:
        logging.exception("Invalid address")
    else:
        print(bad_location_2)

This produces quite helpful errors for invalid inputs:

name='Park' street='Street' city='City' postal_code='12345' latlong=('0', '0') remote=False
ERROR:root:Invalid address
Traceback (most recent call last):
  File "/tmp/u.py", line 98, in <module>
    bad_location = Location.validate(json_data | {"name": 1})
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stas/Documents/Work/simpleem-be/venv/lib/python3.11/site-packages/pydantic/main.py", line 1344, in validate
    return cls.model_validate(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stas/Documents/Work/simpleem-be/venv/lib/python3.11/site-packages/pydantic/main.py", line 551, in model_validate
    return cls.__pydantic_validator__.validate_python(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Location
name
  Input should be a valid string [type=string_type, input_value=1, input_type=int]
    For further information visit https://errors.pydantic.dev/2.7/v/string_type
ERROR:root:Invalid address
Traceback (most recent call last):
  File "/tmp/u.py", line 106, in <module>
    bad_location_2 = Location.validate(json_data)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stas/Documents/Work/simpleem-be/venv/lib/python3.11/site-packages/pydantic/main.py", line 1344, in validate
    return cls.model_validate(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/stas/Documents/Work/simpleem-be/venv/lib/python3.11/site-packages/pydantic/main.py", line 551, in model_validate
    return cls.__pydantic_validator__.validate_python(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for Location
postalCode
  Input should be a valid string [type=string_type, input_value=1, input_type=int]
    For further information visit https://errors.pydantic.dev/2.7/v/string_type

First location was parsed correctly, while two following were malformed and did not pass the validation.

If your example was incorrect and data is structured as your code expects, remove the second validator, add lat&long fields to NestedAddress and handle them in flatten_address similar to dataclass snippet.

Stack Exchange Network

What is the best way to set a class value to a variable in Python if it exists in a dictionary?

2 Answers 2

naming

optional entries

hardcode

table of contents

creating an instance object

camelCase translation

optional entries, revisited

main guard

Correctness

Testing

Readability

Interface

Field types

Validation

Existing tools

Your Answer

Not the answer you're looking for? Browse other questions tagged
python
object-oriented
json
classes
or ask your own question.

Hot Network Questions

What is the best way to set a class value to a variable in Python if it exists in a dictionary?

2 Answers 2

naming

optional entries

hardcode

table of contents

creating an instance object

camelCase translation

optional entries, revisited

main guard

Correctness

Testing

Readability

Interface

Field types

Validation

Existing tools

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged pythonobject-orientedjsonclasses or ask your own question.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
python
object-oriented
json
classes
or ask your own question.