Creating a Python Module for Nepal Stock Exchange API
Bitter-sweet experience of creating an async PyPI module
Introduction
The Nepal Stock Exchange (NEPSE) has been gaining a lot of traction lately. I was interested, so I wondered if I could get its data for my Discord bot. I found a project called nepse, but I was not fond of that module.
Why I disliked the nepse module
- The module was synchronous, hence blocking
- The module returned data as a dict
- The module returned data in camelCase
- The codebase was not properly managed
- There was no proper documentation
Hence, I decided to create my own NEPSE API Wrapper.
How nepse-api will be better than nepse
- It will be asynchronous
- It will return data as an object
- It will return the data in snake_case
- Its codebase will be properly managed
- It will have proper documentation
- It will fetch data from the API rather than scraping
- It will be type-hinted
My Experience
How I started
I got started on my journey of getting GitHub stars. I thought about how I was going to code it and about my directory structure. I looked at other API wrappers and got the basic idea for my code and directory structure from there.
The roadmap I created:
1. Creating Python classes for storing the data returned by the API
It's not the best way, but I simply started creating Python classes with the @dataclass decorator by looking at what the API returned. Some of the fields were null, so it was a hassle to determine their data type; in the end, I just used the Any type from the typing module. The attributes weren't that consistent, so it was a hassle overall.
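A minimal sketch of what such a class can look like is shown here. The class name and all field names (symbol, last_traded_price, sector_name) are hypothetical and not the actual fields returned by the NEPSE API; the field that only ever arrived as null is typed as Any.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SecuritySketch:
    # All field names here are hypothetical, chosen only for illustration.
    symbol: str
    last_traded_price: float
    # This field was null in the sample response, so its real type is unknown.
    sector_name: Any = None


# Build an instance from a (made-up) response dict with snake_case keys.
payload = {"symbol": "ABC", "last_traded_price": 512.0, "sector_name": None}
security = SecuritySketch(**payload)
print(security.symbol, security.last_traded_price)
```

Typing a field as Any keeps the class usable while the real type is unknown, at the cost of losing type checking for that attribute.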
2. Creating various handlers for getting various types of data like company data, broker data, market data, etc.
I got the inspiration from other API wrappers that I had seen in general. I created a MarketClient, SecurityClient, and BrokerClient to get the market data, company data, and broker data respectively. Doing this was a good decision in my opinion because this made my codebase a bit more organized and easier to maintain.
3. Creating a caching system so that the user has the ability to either cache or fetch real-time data
I got the inspiration for this from discord.py. I had recently figured out proper ways of caching, and I used cachetools for it. I was very excited and made caching non-customizable. But after talking with @Santosh#2138, I figured out it was a bad idea not to give the user control over caching. I fixed it and let the user change cache_retain_time and cache_size during initialization.
4. Making sure that the code is fully asynchronous and is not blocking
Just using async/await doesn't mean your code is automatically asynchronous. The code might still be blocking, so I had to make sure everything was asynchronous and nothing was blocking; otherwise, my main motive for making this API wrapper would have failed. I accomplished this by using httpx.AsyncClient instead of requests for fetching the data from the API.
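The difference is easy to see with a small experiment. The sketch below uses only the standard library: time.sleep stands in for a blocking HTTP call and asyncio.sleep stands in for an awaited call on an async client. The function names are made up for illustration and are not part of nepse-api.

```python
import asyncio
import time


async def fetch_blocking(delay: float) -> str:
    # time.sleep blocks the whole event loop, like a synchronous HTTP call would.
    time.sleep(delay)
    return "done"


async def fetch_non_blocking(delay: float) -> str:
    # asyncio.sleep yields control to the loop, like awaiting an async client would.
    await asyncio.sleep(delay)
    return "done"


async def timed(worker, count: int, delay: float) -> float:
    # Run `count` workers through gather and return the elapsed wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start


blocking_time = asyncio.run(timed(fetch_blocking, 3, 0.1))
non_blocking_time = asyncio.run(timed(fetch_non_blocking, 3, 0.1))
print(f"blocking: {blocking_time:.2f}s, non-blocking: {non_blocking_time:.2f}s")
```

Both workers are declared with async def, yet the blocking one runs its three calls one after another, while the non-blocking one overlaps them.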
5. Converting camelCase to snake_case
One of my motives for creating this API wrapper was to make it more Pythonic by returning the data in snake_case rather than camelCase. I had to think of something creative for this because the API returned the data in camelCase. I thought of using regex, but I assumed it would be too slow. I created a monstrous converter, but later I figured out that using regex would make my code more maintainable and that the performance gain from my converter wouldn't be that noticeable, so in the end I used the Python module pyhumps (which uses regex) for converting camelCase to snake_case.
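For a rough idea of how a regex-based conversion can work, here is a minimal standard-library sketch. It is not the pyhumps implementation, and the sample keys are hypothetical rather than real NEPSE field names.

```python
import re

# Split an acronym from a following capitalized word, e.g. "HTTPResponse" -> "HTTP_Response".
_ACRONYM_TO_WORD = re.compile(r"([A-Z]+)([A-Z][a-z])")
# Insert an underscore between a lowercase letter or digit and an uppercase letter.
_LOWER_TO_UPPER = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name: str) -> str:
    name = _ACRONYM_TO_WORD.sub(r"\1_\2", name)
    name = _LOWER_TO_UPPER.sub(r"\1_\2", name)
    return name.lower()


def snake_case_keys(data):
    # Recursively convert dict keys, leaving the values untouched.
    if isinstance(data, dict):
        return {camel_to_snake(k): snake_case_keys(v) for k, v in data.items()}
    if isinstance(data, list):
        return [snake_case_keys(item) for item in data]
    return data


# Hypothetical response fragment, used only to exercise the converter.
sample = {"lastTradedPrice": 512.0, "companyInfo": {"sectorName": "Banking"}}
converted = snake_case_keys(sample)
print(converted)
```

Two short patterns cover the common cases, which is the maintainability argument for regex over a hand-written character loop.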
6. Creating tests for testing my module
I had heard about unit testing, but I wasn't that familiar with it. I installed pytest to get started with it. But I soon figured out that you need to do a bit of extra work to run pytest on async code. You can learn more about async code unit testing here.
Basically, you need to pip install pytest-asyncio, and your code should look like this:
    import pytest

    @pytest.mark.asyncio
    async def test_coroutine():
        # long_computation is a placeholder for the coroutine under test
        res = await long_computation()
        assert res == "DONE"
7. Formatting the code
I used black for code formatting and isort for import sorting. These tools are very good, and they make your code much more readable and elegant. I have been in love with them ever since I started using them. I would highly recommend that you use them as well.
8. Creating Documentation
As this was my first time creating a Python package, I had no idea at all about creating documentation, but Google and YouTube had my back. I wrote docstrings in the Google docstring format. I used Sphinx to generate auto-docs from my code and used reStructuredText for the documentation. You can check out the live version of my documentation here.
9. Publishing the module to PyPI
PyPI is a repository of packages for the Python programming language. If you publish your package there as something-you-made, people all over the world can install it by just running pip install something-you-made. I had first used pipenv for dependency management and setup.py for publishing the package, but since I found out about Python Poetry, my life has been so much easier. It makes it very easy to manage dependencies for your package and publish it to PyPI as well.
10. Managing an open source project
Even though I am a huge advocate of open source projects, there are problems that you face while maintaining them. But now there are many tools in place to help you manage open source projects easily. GitHub Actions is one of them. You can have various checks and tests that run automatically when an event such as a pull request or code push is triggered. You can also create a CODE_OF_CONDUCT, proper contributing guides, issue labels, and projects to efficiently manage your open source project.
Problems I faced along the way
API Not Found
I started working, but immediately after starting, I faced my very first hurdle. I was not able to find an API for getting data about the Nepal Stock Exchange. But after struggling for some time, I figured out the API by looking at the network tab of the browser dev tools.
Asynchronous code
I got started on my journey to getting GitHub stars, but the hurdles wouldn't stop. I had boldly claimed that I would make an asynchronous module, but I didn't have much experience with asynchronous coding. Countless hours of StackOverflow, Googling, and surfing the Python Discord server really did help me. If you want a full guide on this, I may write a blog post about it.
Caching
I used cachetools for caching, but I was not sure which type of caching would be the best. I played around with the LRU (Least Recently Used) caching method. But LRU had a fatal flaw for my use case. I wanted the cache to expire because the data was very volatile, but LRU couldn't provide that. So, I decided to use the TTL (Time To Live) caching method. Using TTL meant that cached entries would expire, which was perfect for my use case.
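To make the idea concrete, here is a minimal standard-library sketch of a size-bounded cache with time-based expiry. It is a stand-in written for illustration, not the cachetools implementation; the class name is made up, and its maxsize and ttl parameters are assumed to play roughly the roles of cache_size and cache_retain_time.

```python
import time
from collections import OrderedDict


class TTLCacheSketch:
    """Size-bounded cache whose entries expire after `ttl` seconds."""

    def __init__(self, maxsize: int, ttl: float, timer=time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.timer = timer
        self._store = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value):
        self._store.pop(key, None)
        self._store[key] = (self.timer() + self.ttl, value)
        # Evict the oldest entry when the size limit is exceeded.
        while len(self._store) > self.maxsize:
            self._store.popitem(last=False)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self.timer() >= expires_at:
            # Stale entry: drop it so the caller fetches fresh data.
            del self._store[key]
            return default
        return value


# A fake clock makes the expiry visible without actually waiting.
now = [0.0]
cache = TTLCacheSketch(maxsize=2, ttl=60, timer=lambda: now[0])
cache.set("ABC", 512.0)
fresh = cache.get("ABC")
now[0] = 61.0
expired = cache.get("ABC")
print(fresh, expired)
```

A pure LRU cache would keep returning the old price until the entry was pushed out by newer keys; with a TTL, the entry disappears once it is older than the retain time, regardless of how often it is read.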
Legal issues
I don't know much about the IT policy or cyber laws of Nepal. So, when I heard that legal action could be taken against me just for making that Python module, I was scared. But later I found out that I wouldn't be in any trouble because the API was public, I had given reference and copyright to NEPSE for their data, and I had never claimed their data as mine. A lesson for everyone: fact-check everything before believing anything blindly.
Problems I haven't been able to solve
Getting live market data
Even though I was able to get the market data for a company from the New NepalStock Site, it just wasn't possible to get real-time market data from that site. The data there updated very slowly, and hardly any of it was real-time. While searching for a solution to this problem, I discovered a paid NEPSE API that returned real-time data, but 10k-60k per month was too costly for a school student like me.
NEPSE changing their API
This has been one of the biggest problems for me while making this API wrapper. Within a week of my making it, NEPSE made some breaking changes to their API. They changed the request method from GET to POST and added a payload containing an id. This id is uniform across every request, but it changes every time the id in the response of the /market-open endpoint changes. I created a mapping of /market-open's id to the id sent in the payload of the POST request. But it didn't work out, because the numbers went up to 70, so it was not feasible to create a mapping for that many values. I tried to figure out a mathematical relationship between the two, but that didn't work either. You can learn about the issue in detail here.
Conclusion
All of this was a bitter-sweet experience. I learned a lot of new things: properly working with asynchronous code in Python, creating context managers, creating async generators, creating documentation, GitHub Actions, and many more. I faced a ton of problems along the way, but those problems helped me grow into a better developer. In the end, I would just like to say that if you have any questions, you can always DM me on Discord at sussy#0002, and it would be very helpful if you could contribute to nepse-api.