Predicting Salary Based on Stack Overflow Survey Data
Image source: Self made
Summary
Last time we analysed job satisfaction in the Stack Overflow survey data and could not get a good model out of it. This time we try salary. If you have already read the first post (you can find it here: Stack Overflow job satisfaction), you may find some information repeated. Sorry for that.
Problem Domain
We will be talking about job salary, with a focus on coding-oriented jobs. A salary depends on several factors that have to be taken into account:
Years of Experience
Education
Company Size
Frameworks / Tools / Languages used
Comparison to somebody else doing a similar job
Location
and even more.
All of this ends up represented by a single number (we will be using annual USD). It is not easy to split a salary into the parts that would allow a more objective measurement and comparison with others.
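To make "a single number in annual USD" concrete, here is a minimal sketch of how a reported compensation could be annualised and converted. The exchange rates and the 50 working weeks per year are assumptions of this sketch, and `annual_usd` is a hypothetical helper, not code from the project.

```python
# Hypothetical exchange rates to USD; real values would come from a rates table.
USD_PER_UNIT = {"USD": 1.0, "EUR": 1.08, "GBP": 1.30}

# Assumed number of pay periods per year for each CompFreq answer.
PERIODS_PER_YEAR = {"Weekly": 50, "Monthly": 12, "Yearly": 1}


def annual_usd(amount, comp_freq, currency):
    """Convert a reported compensation amount to an annual USD figure."""
    if comp_freq not in PERIODS_PER_YEAR:
        raise ValueError(f"unknown compensation frequency: {comp_freq!r}")
    if currency not in USD_PER_UNIT:
        raise ValueError(f"no exchange rate for currency: {currency!r}")
    return amount * PERIODS_PER_YEAR[comp_freq] * USD_PER_UNIT[currency]


if __name__ == "__main__":
    # 4000 EUR per month, annualised and converted with the assumed rate.
    print(annual_usd(4000, "Monthly", "EUR"))
```

The point is only that three survey answers (amount, CompFreq, currency) collapse into one target value; everything else about the job is lost in that step.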
Stack Overflow
Thanks to Stack Overflow and its community, a survey is released every year covering the tech stack people use, where they work from, and how much they earn.
Problem Statement
Job salary is always a big topic on business platforms such as LinkedIn and XING (popular in German-speaking regions). On both services you have to pay a monthly subscription to get the information you are interested in.
Salary has always been, and will remain, a strong argument for looking for or changing a job. On the other hand, not many people want to share this information: they would rather go without it than reveal their own salary in exchange.
At the end of this blog post, you will find a section where you can enter your data and get a predicted salary back. The application takes the features as input from an HTML form and returns the prediction without making any further API calls.
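Before a model can predict anything, the form answers have to become numbers. The following is a rough sketch of that step, written in Python for readability even though the app itself runs in the browser. The category lists, the `encode` helper, and the choice to encode a missing numeric answer as 0.0 are all assumptions of this sketch, not the project's actual preprocessing.

```python
# Hypothetical category lists; the real ones would come from the training data.
CATEGORIES = {
    "CompFreq": ["NA", "Weekly", "Monthly", "Yearly"],
    "NEWDevOps": ["NA", "Yes", "No", "Not sure"],
}
NUMERIC = ["WorkWeekHrs", "YearsCodePro", "YearsCode"]


def encode(answers):
    """Turn form answers (a dict of strings) into a flat list of floats."""
    vector = []
    # One-hot encode each categorical answer; "NA" is a category of its own.
    for name, options in CATEGORIES.items():
        choice = answers.get(name, "NA")
        if choice not in options:
            choice = "NA"  # unknown answers fall back to NA
        vector.extend(1.0 if option == choice else 0.0 for option in options)
    # Numeric answers are passed through; NA or empty becomes 0.0 here.
    for name in NUMERIC:
        try:
            vector.append(float(answers.get(name, "NA")))
        except (TypeError, ValueError):
            vector.append(0.0)
    return vector


if __name__ == "__main__":
    print(encode({"CompFreq": "Monthly", "YearsCode": "10"}))
```

Whatever encoding is used, it has to match the one applied to the training data exactly, including the order of the columns.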
Privacy
All potentially private information stays in the browser, so you don't have to worry about your data. I trained the model using TensorFlow 2.0 with the Keras API, and I use TensorFlow.js to load the trained model and make predictions. I plan to write a blog post with more detail on how to use TensorFlow.js.
Predicting Salary app
Stack Overflow capstone project by Darius Murawski
Welcome to your salary prediction. This data is based on the Stack Overflow survey of 2020. If you are interested in the source code, stay tuned!
Pick NA when you don't want to give an answer
CompFreq
Is that compensation weekly, monthly, or yearly?
CurrencySymbol
Which currency do you use day-to-day? If your answer is complicated, please pick the one you're most comfortable estimating in.
CurrencyDesc
Which currency do you use day-to-day? If your answer is complicated, please pick the one you're most comfortable estimating in.
NEWDevOpsImpt
How important is the practice of DevOps to scaling software development?
NEWDevOps
Does your company have a dedicated DevOps person?
NEWOnboardGood
Do you think your company has a good onboarding process? (By onboarding we mean the structured process of getting you settled in to your new role at a company)
Country
Where do you live?
PurchaseWhat
What level of influence do you personally have over new technology purchases at your organization?
NEWEdImpt
How important is a formal education, such as a university degree in computer science, to your career?
NEWJobHunt
In general, what drives you to look for a new job? Select all that apply.
JobFactors
Imagine that you are deciding between two job offers with the same compensation, benefits, and location. Of the following factors, which 3 are MOST important to you?
NEWJobHuntResearch
When job searching, how do you learn more about a company? Select all that apply.
NEWCollabToolsWorkedWith
Which collaboration tools have you done extensive development work in over the past year, and which do you want to work in over the next year? (If you worked with the tool and want to continue to do so, please check both boxes in that row.)
WorkWeekHrs
On average, how many hours per week do you work? Please enter a whole number in the box.
YearsCodePro
NOT including education, how many years have you coded professionally (as a part of your work)?
YearsCode
Including any education, how many years have you been coding in total?
Improvements
I picked the 15 most relevant features to get a good model without asking the user to enter too much data.
You can add more features, as I did, to have even more columns available for training a model, but remember that this will also increase the memory and runtime requirements of your model.
As the model was already good, I decided against using historical data as well and against doing any hyperparameter tuning.
Feedback
Like or don't like what you got? Let's have a chat and check if we can improve it!
Code
If you want to train the model yourself, without the embedding in the blog, check out the source code in my GitHub repository.