AI-Driven Software Pricing: An Integrated Approach with Prompt Engineering for Market Analysis

Gregory Fernandes Muniz; Joelcio de Carvalho Tonera; Rodrigo Perozzo Noll; Genizia Islabão de Islabão

doi:10.5753/jbcs.2026.6572

Authors

Gregory Fernandes Muniz Instituto Federal do Rio Grande do Sul (IFRS) https://orcid.org/0009-0006-6432-8242
Joelcio de Carvalho Tonera Instituto Federal do Rio Grande do Sul (IFRS) https://orcid.org/0009-0008-8520-4825
Rodrigo Perozzo Noll Instituto Federal do Rio Grande do Sul (IFRS) https://orcid.org/0000-0001-5658-6248
Genizia Islabão de Islabão Instituto Federal do Rio Grande do Sul (IFRS) https://orcid.org/0000-0002-0866-5766

DOI:

https://doi.org/10.5753/jbcs.2026.6572

Keywords:

Software pricing, Prompt engineering, LLM, Generative AI, Market analysis, Innovation management

Abstract

Software pricing based on valuation remains a significant challenge because software is intangible, business models vary widely, and markets are volatile. This article presents a protocol for pricing by analogy, mediated by large language models (LLMs) and based on prompt engineering, that draws on public evidence (sitemaps, functional documentation, and competitor pricing pages). An experimental study was conducted with six software products, applying the same structured prompt to three LLMs, for a total of 18 executions with a standardized informational scope. The sample comprised five Innovation Management systems (INTEGRA, HYPE Innovation, IdeaScale, Viima/HYPE Boards, and Qmarkets) and one Customer Relationship Management (CRM) system (Salesforce). The three LLMs were ChatGPT 5.1 Thinking, Gemini 3 Pro, and DeepSeek-V3.2. The LLMs extracted functionalities from sitemaps, mapped competitors, synthesized price benchmarks, and suggested market value ranges. The consolidated orders of magnitude converge, for example, to US$ 8,000–25,000/year for INTEGRA (per-instance license), ∼US$ 1,200–3,600 per user/year for Salesforce (per-seat model), and US$ 50,000–100,000/year for HYPE Innovation (enterprise license), with intermediate levels for IdeaScale (∼US$ 15,000–70,000/year), Viima/HYPE Boards (∼US$ 6,000–18,000/year), and Qmarkets (∼US$ 30,000–55,000/year), in line with the functional depth and integration complexity observed. As a validation step, the estimates from the three LLMs were compared with reference prices obtained from public sources, after standardization (midpoint when a range existed, conversion to an annual basis, and currency conversion where applicable). The results were evaluated by checking whether the annual market price fell within the range estimated by each LLM (inside/outside the interval) and by calculating the ratio of the market price to the midpoint of that range (market price ÷ midpoint), expressed as a percentage, as a measure of proximity to the midpoint.
Under the interval coverage criterion, ChatGPT performed best (5/6), followed by Gemini (4/6) and DeepSeek (2/6), suggesting that ChatGPT more consistently proposed intervals compatible with the observed prices. Taken together, the results indicate convergence in orders of magnitude, albeit with occasional discrepancies, suggesting that the protocol is best suited as an exploratory price screening tool that complements traditional methods. The main contribution is a reproducible and innovative protocol in which a single prompt, applied in separate conversations for each software product and each model, yields the functionalities extracted from the sitemap, the competitive benchmarking, the comparative table of functionalities, and the market value estimate, enabling an LLM-based approach to price search and analysis.
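The two validation measures described in the abstract (interval coverage and the market price ÷ midpoint ratio) can be illustrated with a minimal Python sketch. The names `PriceRange`, `annualize`, and `evaluate` are hypothetical, the ratio is assumed to use the midpoint of the LLM-estimated range, currency conversion is omitted, and the numbers are made-up inputs rather than results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRange:
    """An annual price interval in a single currency (hypothetical helper)."""
    low: float
    high: float

    @property
    def midpoint(self) -> float:
        return (self.low + self.high) / 2


def annualize(price: float, periods_per_year: int) -> float:
    """Convert a periodic price (e.g. monthly, periods_per_year=12) to an annual basis."""
    return price * periods_per_year


def evaluate(market_price: float, estimate: PriceRange) -> tuple[bool, float]:
    """Return (inside the estimated interval?, market price / midpoint as a percentage)."""
    inside = estimate.low <= market_price <= estimate.high
    proximity_pct = 100 * market_price / estimate.midpoint
    return inside, proximity_pct


# Illustrative inputs only (NOT data from the study): a reference price published
# as 1,000 to 1,400 per month and an LLM estimate of 10,000 to 20,000 per year.
reference = PriceRange(annualize(1_000, 12), annualize(1_400, 12))
market_price = reference.midpoint  # midpoint is used when the reference is a range
inside, pct = evaluate(market_price, PriceRange(10_000, 20_000))
print(inside, round(pct, 1))  # True 96.0
```

A ratio near 100% means the market price sits close to the centre of the estimated range. Counting the `inside` flags across the six products would give a coverage score of the kind reported above (for example, 5/6).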
