Analyzing Hybrid and Electric Vehicle Market Share

Recently, several governments, including the United Kingdom, Japan, and California, have announced plans to phase out new purely gasoline-powered cars in favor of hybrid and electric vehicles. They rank as the world's 5th largest economy, the 3rd largest, and, in California's case, what would be the 5th largest if it were a country.

While nearly everyone expected pure gasoline-powered vehicles to be phased out eventually, many people were surprised by the proposed timetables, which range from 2030 to 2035. Most large economies that have announced a phase-out plan set their target dates between 2040 and 2050. India's optimistic 2030 target for full electrification of all vehicles is an outlier.

Currently, hybrid and electric vehicles are a small portion of the overall market. For EVs specifically, supporting infrastructure ranges from good to nonexistent depending on the area.

In this example, we'll use the CIS Automotive API to analyze the growth of electric and hybrid vehicles in Washington State. Washington has selected a 2050 target for net zero carbon emissions and in March passed a law requiring at least 5% of new vehicle sales to be EVs. Lawmakers considered a ban on gasoline-powered cars that would have taken effect in 2030, but did not pass it into law. Washington enjoys a large amount of cheap hydroelectric power, which makes EVs an attractive choice.

Figure: Ford new inventory makeup in Washington State by fuel type, 2018-2020, showing the rise of Ford hybrid market share.

Pulling the listing data

We'll use the /listingsByRegionAndDate endpoint to search historical snapshots of car dealers' inventories in Washington. With this endpoint, we can pull from our vehicle database as far back as 2016, or from another region. For this example, we'll look at inventory for November of the last three years.
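The date windows for those three Novembers can be sketched with the standard library alone. The helper names `month_window` and `november_windows` are hypothetical and not part of the API driver, and whether the endpoint treats the end date as inclusive or exclusive is not stated here, so treat the end value as an assumption to check against the API documentation.

```python
import calendar
from datetime import date, timedelta


def month_window(month_start):
    """Return (startDate, endDate) ISO strings covering one calendar month.

    month_start is an ISO date string for the first day of the month.
    The end date is the first day of the following month.
    """
    start = date.fromisoformat(month_start)
    # monthrange returns (weekday of the 1st, number of days in the month)
    days_in_month = calendar.monthrange(start.year, start.month)[1]
    end = start + timedelta(days=days_in_month)
    return start.isoformat(), end.isoformat()


def november_windows(last_year, count=3):
    """Build the November window for last_year and the years before it."""
    return [month_window(date(last_year - i, 11, 1).isoformat())
            for i in range(count)]
```

Using the real month length instead of a fixed 30 days gives the same window for November and keeps the helper correct for other months.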

Because we're going to be working with a lot of data, we'll use Python's multiprocessing module to make many requests in parallel. This example is pretty simple as far as parallel code goes, but there are a few gotchas when passing data between worker processes.

  • We'll want to share authentication tokens between workers to cut down on the total number of requests we make, so we'll pass JSON-style dictionary representations of our API driver's state to the asynchronous code.
  • If we just created new CisApi objects in each worker without sharing tokens between them, our performance would suffer because each object would unnecessarily re-authenticate itself and double the number of requests we made.
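The token-sharing pattern described above can be sketched without a network connection. `FakeClient` below is a hypothetical stand-in, not the real CisApi class; only the `toDict`/`fromDict`/`getToken` method names mirror the driver calls used in this article. To keep the sketch runnable anywhere, it uses `multiprocessing.pool.ThreadPool`, which exposes the same `apply_async` interface as the process pool in the example code but runs workers as threads.

```python
from multiprocessing.pool import ThreadPool

AUTH_CALLS = []  # one entry per simulated authentication request


class FakeClient:
    """Hypothetical stand-in for an API driver (not the real CisApi)."""

    def __init__(self):
        self.token = None

    def getToken(self):
        AUTH_CALLS.append(1)  # simulate a round trip to the auth server
        self.token = "demo-token"

    def toDict(self):
        return {"token": self.token}

    def fromDict(self, state):
        self.token = state["token"]

    def fetchPage(self, page, maxPages=5):
        if self.token is None:
            self.getToken()  # a fresh client has to authenticate first
        return {"maxPages": maxPages, "listings": ["car-%d" % page]}


def fetch_with_shared_token(state, page):
    """Worker: rebuild a client from shared state, then fetch one page."""
    client = FakeClient()
    client.fromDict(state)  # reuse the parent's token, no extra auth request
    return client.fetchPage(page)


def fetch_all_pages(worker_count=4):
    client = FakeClient()
    first = client.fetchPage(1)  # authenticates once, reveals the page count
    listings = list(first["listings"])
    with ThreadPool(worker_count) as pool:
        jobs = [pool.apply_async(fetch_with_shared_token, (client.toDict(), p))
                for p in range(2, first["maxPages"] + 1)]
        for job in jobs:  # collect results in page order
            listings.extend(job.get()["listings"])
    return listings
```

Calling `fetch_all_pages()` fetches five simulated pages while recording a single authentication in `AUTH_CALLS`. Dropping the `fromDict` call in the worker would make every page trigger its own authentication, which is the doubling of requests the second bullet warns about.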

We could also sign up for an enterprise plan to get a vehicle data dump that would speed up our analysis even more. This particular use case would benefit significantly from running queries against a SQL database.
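To give a rough idea of why a SQL database suits this kind of analysis, here is a minimal sketch using Python's built-in sqlite3 module. The `listings` table, its three columns, and the sample rows are invented for illustration only; they are not the schema or contents of the actual data dump.

```python
import sqlite3


def fuel_mix_by_brand(rows):
    """Count listings per (brand, month, fuel) with a single GROUP BY query.

    rows is an iterable of (brand, month, fuel) tuples. The table layout
    is a hypothetical one chosen for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (brand TEXT, month TEXT, fuel TEXT)")
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", rows)
    query = """
        SELECT brand, month, fuel, COUNT(*) AS n
        FROM listings
        GROUP BY brand, month, fuel
        ORDER BY brand, month, fuel
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Made-up rows, only to show the shape of the input
sample_rows = [
    ("Ford", "2020-11-01", "gas"),
    ("Ford", "2020-11-01", "hybrid"),
    ("Ford", "2020-11-01", "gas"),
    ("Chevrolet", "2020-11-01", "ev"),
]
```

With the data already in a database, the grouping that takes several nested loops and thousands of paged requests in this article collapses into one query.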

Example Code


from cisapi import CisApi
from datetime import date, timedelta
import multiprocessing

#wrapper to pull data asynchronously
def asyncGet(apiDict, region, modelName, startDate, endDate, page, newCars):
    api=CisApi()
    api.fromDict(apiDict) #load shared token to cut down on total requests
    return api.listingsByRegionAndDate(region, modelName, startDate, 
                endDate, page, newCars)

#helper function for bookkeeping
def incMap(m, key):
    if(not(key in m)):
        m[key]=0
    m[key]=m[key]+1


if(__name__=="__main__"):
    #this guard is required because of how multiprocessing works in python:
    #worker processes may re-import this module, so any code outside
    #this if statement would run again in every worker
    api=CisApi()
    #pick the regions and months we'll analyze
    region="REGION_STATE_WA"
    brandNames=["Ford", "Chevrolet", "Toyota"]
    months=["2020-11-01", "2019-11-01","2018-11-01"]
    newCars=True
    threadCount=8
    pool=multiprocessing.Pool(threadCount)
    listingData={}

    listingData[region]={}
    for brand in brandNames:
        models=api.getModels(brand)
        if(not (brand in listingData[region])):
            listingData[region][brand]={}
        for model in models["data"]:
            modelName=model["modelName"]
            for month in months:
                if(not(month in listingData[region][brand])):
                    listingData[region][brand][month]=[]
                maxPages=99
                page=1
                startDate=month
                jobs=[]
                endDate=date.fromisoformat(month)+timedelta(30)
                endDate=endDate.isoformat()
                #fetch page 1 directly to learn how many pages there are
                res= api.listingsByRegionAndDate(region, modelName,
                        startDate, endDate, page, newCars=newCars)
                respData=res["data"]
                maxPages=respData["maxPages"]
                for listing in respData["listings"]:
                    listingData[region][brand][month].append(listing)
                #queue the remaining pages on the worker pool
                for i in range(2,maxPages+1):
                    if(api.needsRefresh(safetyFactor=3.0)):
                        #make sure token is valid before passing
                        #it to sub processes
                        api.getToken()
                    j=pool.apply_async(asyncGet, [api.toDict(),
                        region, modelName, startDate, endDate, i, newCars])
                    jobs.append(j)
                for job in jobs:
                    respData=job.get()["data"]
                    for listing in respData["listings"]:
                        listingData[region][brand][month].append(listing)
        
    pool.close()
    pool.join()

Figure: Chevrolet new inventory makeup in Washington State by fuel type, 2018-2020, showing Chevrolet's switch from hybrid vehicles to electric vehicles.

Running our market statistics

Now that we've pulled our inventory data, we can run some analysis on it. We'll count the number of new hybrid and electric vehicles to determine their market share for each brand over the three years we're looking at. The data tells a different story for each brand, reflecting the different strategy each one is following.
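The counting step can be sketched on its own with a few hand-built listings. The `vinDecode`, `FuelTypePrimary`, and `FuelTypeSecondary` field names and the "Gasoline" and "Electric" labels come from the example code in this article; the `classify` and `market_share` helpers are hypothetical names for this sketch. As in the example code, everything that is neither hybrid nor EV is counted as gasoline, which is a simplification.

```python
from collections import Counter


def classify(primary, secondary):
    """Label a listing as 'hybrid', 'ev', or 'gas' from its fuel fields."""
    # Hybrids may list gasoline and electric in either order
    if {primary, secondary} == {"Gasoline", "Electric"}:
        return "hybrid"
    if primary == "Electric":
        return "ev"
    return "gas"  # catch-all for every other fuel combination


def market_share(listings):
    """Return the percentage of listings in each fuel category."""
    counts = Counter(
        classify(item["vinDecode"]["FuelTypePrimary"],
                 item["vinDecode"]["FuelTypeSecondary"])
        for item in listings
    )
    total = sum(counts.values())
    if total == 0:
        return {}  # no listings, so no shares to report
    return {key: 100.0 * counts[key] / total
            for key in ("gas", "hybrid", "ev")}
```

Returning an empty dictionary for an empty list avoids a division by zero when a brand has no listings in a given month.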

It's easy to see how GM (owner of Chevrolet) has transitioned away from hybrids towards EVs. This trend lines up with public statements made by the company that downplay hybrids and favor EVs. GM views EVs as the future and considers hybrids to just be a tool to help transition away from gasoline powered vehicles. The large increase in Chevrolet EVs available in 2020 was also likely driven by the change to Washington State law mentioned earlier.

Ford showed a low presence of hybrids for 2018 and 2019, but had a large increase in hybrid market share in 2020. Ford has announced more EV support, including an electric version of the popular Transit van, an electric F-150, and the Mach-E, an electric Mustang, but so far its EV offering is slim.

Toyota has had consistently strong hybrid options available in large quantities, which isn't surprising considering the popularity of the Prius and the hybrid versions of Toyota's other vehicles. As with Ford, we can see that Toyota's EV offering in the US is lacking, but that is because Toyota believes hybrids make better use of the existing battery supply. Hybrids require a much smaller battery capacity than pure EVs, so Toyota can make a lot more hybrids than EVs with the same number of batteries. However, Toyota has announced a shift towards more EV options to stay relevant in the locations that include hybrids in their gasoline-powered vehicle phase-out. Toyota also offers the hydrogen-powered Mirai, but availability is limited to select areas of California and Hawaii, so it won't show up in this analysis.

Figure: Toyota new inventory makeup in Washington State by fuel type, 2018-2020, showing Toyota hybrid market share.

Example Code


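#this snippet continues the script above; indent it under the
#__main__ guard so worker processes don't run it
#prepare grouped counts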
computedStats={}
computedStats[region]={}
for brand in brandNames:
    computedStats[region][brand]={}
    for month in months:
        computedStats[region][brand][month]={"hybrid":0, "ev":0, "total":0}

gas="Gasoline"
electric="Electric"
#count the number of cars for each fuel
#and print the results
for brand in brandNames:
    for month in months:
        listings=listingData[region][brand][month]
        for listing in listings:
            primaryFuel=listing["vinDecode"]["FuelTypePrimary"]
            secondaryFuel=listing["vinDecode"]["FuelTypeSecondary"]
            if((primaryFuel==gas and secondaryFuel==electric) 
                or (primaryFuel==electric and secondaryFuel==gas)):
                #hybrid
                #primary and secondary fuel label order for hybrids vary 
                #depending on model and trim
                incMap(computedStats[region][brand][month],"hybrid")
            elif(primaryFuel==electric):
                incMap(computedStats[region][brand][month],"ev")
            incMap(computedStats[region][brand][month],"total")
        total=computedStats[region][brand][month]["total"]
        if(total==0):
            #no listings for this brand and month
            #skip it to avoid dividing by zero
            continue
        evCount=computedStats[region][brand][month]["ev"]
        hybridCount=computedStats[region][brand][month]["hybrid"]
        evPercent=float(evCount)/total*100
        hybridPercent=float(hybridCount)/total*100
        #everything that is not a hybrid or an EV is counted as gas
        gasPercent=100-evPercent-hybridPercent
        print("brand: "+brand+" month: "+month+" gasPercent:"+str(gasPercent)
            +" hybridPercent:"+str(hybridPercent)+" evPercent:"+str(evPercent))
        
    

Conclusion

We've pulled dealership inventory data to look at the mix of each brand's hybrid and electric vehicles. Our analysis reveals some significant increases in hybrid and EV market share over the last few years. We expect this market share to increase as more hybrid and EV options come to market, as charging infrastructure improves, and as the phase-out dates for gasoline-powered vehicles grow nearer.

If you'd like to extend our demo code to perform analysis on other states to see how electricity prices might influence EV market share, you can sign up here. Large-scale analysis like this particular use case really benefits from a data dump, so feel free to contact us about our enterprise options.

